August, 2006

  • The Old New Thing

    A look inside WinInet's index.dat file and changes in IE7 and Vista

    • 5 Comments

    My frequent bicycling buddy Ari Pernick wrote a couple of articles over on the Windows Network Development blog on the topic of the index.dat file, which appears to have gotten a bit of attention lately.

    This past weekend, I joined Ari and another friend in a ride along the Upper Loop of the annual Tour De Peaks bicycle ride. I'd never done this ride before; I tend to do the same routes over and over. This doesn't bother me like it does other people. We had originally planned to do both the Upper and Lower loops (50km each for a total of 100km), and Ari chatted with one of the ride organizers about the characteristics of the two routes in order to decide which one to take first. During the discussion, the gentleman mentioned the "food committee". That reminded me that one of Tour De Peaks' claims to fame is that it has the best food of any Northwest bicycle ride. We ended up abandoning after the first half, because both Ari and I had other things we needed to get done that weren't on our schedule when we originally signed up for the ride. But we did have the food, and it lived up to the hype. The breakfast table included fresh fruit, juice, organic coffee, muffins, and mini-bagels; lunch included pasta, potato salad, salmon quesadillas, sandwiches, pizza, brownies, and caramel popcorn. Sure beats a bottle of water and a Clif Bar. Highly recommended.

    More than once, somebody pointed out to me that the cap was missing from my water bottle. I do that on purpose. After taking a few gulps, I tuck the open bottle in my back pocket. (Bicycling shirts have pockets on the back for convenience. I keep the water there instead of in the bottle cage since most bottles not specifically designed for it are too small for the cage and end up rattling around and eventually falling out.) That way, when I want a drink of water, I can just reach back, grab a few mouthfuls, and tuck it back into my shirt pocket—I don't like drinking from those sport bottle tops. The water delivery rate is just too slow!

  • The Old New Thing

    Candidate for most obscure keyboard shortcut: Shift+F8

    • 28 Comments

    One of the most obscure keyboard shortcuts has got to be Shift+F8, which is used for listbox discontiguous extended selection. Man, what a mouthful. KB article Q301583 doesn't help matters by listing this keyboard shortcut under "Dialog box keyboard shortcuts" even though it isn't a dialog box keyboard shortcut. It's a listbox keyboard shortcut.

    If the listbox supports extended selections (via LBS_EXTENDEDSEL), then you can use the Shift+F8 shortcut to create discontiguous multiple selections from the keyboard. (Via the mouse, you can just Ctrl+Click to create a discontiguous multiple selection.) Type Shift+F8 once to enter extended selection mode, then use the arrow keys to select an item and press Ctrl+Space or Shift+Space to select (or deselect) it. When finished, type Shift+F8 again (or just move focus to some other window).

    And yes, this particular keyboard interface is pretty wacked out. A more natural mechanism would be to have Ctrl+Arrow move the focus without changing the selection, and then have Ctrl+Space select (or deselect) the focused item. Thankfully, the list view control went for that approach rather than emulating the crazy Shift+F8 keyboard shortcut.

  • The Old New Thing

    Even more about C# anonymous methods, from the source

    • 81 Comments

    If you want to know still more about C# anonymous methods, you can check out the web site of Grant Richins, who has an entire category devoted to anonymous methods. And he should know, since he actually implemented them.

    Now that CLR week is over, I'm curious what you all thought of it. Would you like to see another CLR week at some point? Should I stick to Win32? (Or doesn't it matter because I'm an arrogant Microsoft apologist either way?)

  • The Old New Thing

    The day Tully's ran out of coffee

    • 17 Comments

    Today, Tully's coffee shops begin offering free Wi-Fi (in shops where Wi-Fi is available). Tully's isn't as widespread as Starbucks, but it's the best of the major chain coffee shops in the Seattle area, according to a highly unscientific poll of my friends.

    Seeing Tully's name back in the news reminded me of an incident that happened to one of my colleagues, who went into the local Tully's one evening and asked for a cup of drip coffee. The Tully's employee went to the pot, then turned around and said, "Actually, we don't have any coffee."

    My colleague did get a profuse apology from the district manager, who mentioned that the company policy is that the stores are always to have fresh, hot coffee available during all hours of operation. And as far as I know, every time since then, when my colleague stopped by for an evening cup of coffee, they had it ready.

    But I love the quote.

  • The Old New Thing

    The implementation of anonymous methods in C# and its consequences (part 3)

    • 38 Comments

    Last time we saw how the implementation details of anonymous methods can make themselves visible when you start taking a delegate apart by looking at its Target and Method. This time, we'll see how an innocuous code change can result in disaster due to anonymous methods.

    Occasionally, I see people arguing over where local variables should be declared. The "decentralists" believe that variables should be declared as close to their point of first use as possible:

    void MyFunc1()
    {
     ...
     for (int i = 0; i < 10; i++) {
      string s = i.ToString();
      ...
     }
     ...
    }
    

    On the other hand, the "consolidators" believe that local variables should be declared outside of loops.

    void MyFunc2()
    {
     ...
     string s;
     for (int i = 0; i < 10; i++) {
      s = i.ToString();
      ...
     }
     ...
    }
    

    The "consolidators" argue that hoisting the variable s means that the compiler only has to create the variable once, at function entry, rather than each time through the loop.

    As a result, you can find yourself caught in a struggle between the "decentralists" and the "consolidators" as members of each school touch a piece of code and "fix" the local variable declarations to suit their style.

    And then there are the "peacemakers" who step in and say, "Look, it doesn't matter. Can't we all just get along?"

    While I admire the desire to have everyone get along, the claim that it doesn't matter is unfortunately not always true. Let's stick some nasty code in where the dots are:

    delegate void MyDelegate();
    void MyFunc1()
    {
     MyDelegate d = null;
     for (int i = 0; i < 10; i++) {
      string s = i.ToString();
      d += delegate() {
       System.Console.WriteLine(s);
      };
     }
     d();
    }
    

    Since the s variable is declared inside the loop, each iteration of the loop gets its own copy of s, which means that each delegate gets its own copy of s. The first time through the loop, an s is created with the value "0" and that s is used by the first delegate. The second time through the loop, a new s is created with the value "1", and that new s is used by the second delegate. The result of this code fragment is ten delegates, each of which prints a different number from 0 to 9.

    Now, a "consolidator" looks at this code and says, "How inefficient, creating a new s each time through the loop. I shall hoist it and bask in the accolades of my countrymen."

    delegate void MyDelegate();
    void MyFunc2()
    {
     MyDelegate d = null;
     string s;
     for (int i = 0; i < 10; i++) {
      s = i.ToString();
      d += delegate() {
       System.Console.WriteLine(s);
      };
     }
     d();
    }
    

    If you run this fragment, you get different behavior. A single s variable is created for all the loop iterations to share. The first time through the loop, the value of s is "0", and then the first delegate is created. The second loop iteration changes the value of s to "1" before creating the second delegate. Repeat for the remaining eight delegates, and at the end of the loop, the value of s is "9", and ten delegates have been added to d. When d is invoked, all the delegates print the value of the s variable, which they are sharing and which has the value "9". The result: 9 is printed ten times.
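    For readers who want to try this out, here is a minimal runnable sketch of the two variants. It collects the output in a list instead of writing to the console so that the difference is easy to check. The names ScopeDemo, DeclaredInsideLoop, and DeclaredOutsideLoop are invented for this illustration and do not appear in the original fragments.

```csharp
using System.Collections.Generic;

public static class ScopeDemo
{
    private delegate void MyDelegate();

    // "Decentralist" version: s is declared inside the loop body,
    // so every iteration captures a fresh variable.
    public static List<string> DeclaredInsideLoop()
    {
        List<string> output = new List<string>();
        MyDelegate d = null;
        for (int i = 0; i < 10; i++) {
            string s = i.ToString();
            d += delegate() { output.Add(s); };
        }
        d();
        return output;   // "0", "1", ..., "9"
    }

    // "Consolidator" version: s is hoisted out of the loop,
    // so all ten delegates share a single variable.
    public static List<string> DeclaredOutsideLoop()
    {
        List<string> output = new List<string>();
        MyDelegate d = null;
        string s;
        for (int i = 0; i < 10; i++) {
            s = i.ToString();
            d += delegate() { output.Add(s); };
        }
        d();
        return output;   // "9" ten times
    }
}
```

    The only textual difference between the two methods is where s is declared, which is exactly the kind of change a "clean-up" pass tends to make.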

    Now, I happen to have constructed this scenario to make the "consolidators" look bad, but I could also have written it to make the "decentralists" look bad for pushing a variable declaration into a loop scope when it should have remained outside. (All you have to do is read the above scenario in reverse.)

    The point of this little exercise is that when a "consolidator" or a "decentralist" goes through an entire program "tuning up" the declarations of local variables, the result can be a broken program, even though the person making the change was convinced that their change "had no effect; I was just making the code prettier / more efficient".

    What's the conclusion here?

    Write what you mean and mean what you write. If the precise scope of a variable is important, make sure to comment it as such so that somebody won't mess it up in a "clean-up" pass over your program. If there are two ways of writing the same thing, then write the one that is more maintainable. And if you feel that one method is superior from a performance point of view, then (1) make sure you're right, and (2) make sure it matters.

  • The Old New Thing

    News Flash: Big houses also cost more to cool

    • 5 Comments

    Perhaps as a counterpart to the fact that big houses have bigger heating bills, NPR yesterday pointed out that bigger houses use more electricity for cooling. (NPR looks not at the "surprise" of big-house owners over the cost of energy, but rather the consequences of these big houses on the energy grid. But the headline was hard to pass up.)

  • The Old New Thing

    The implementation of anonymous methods in C# and its consequences (part 2)

    • 15 Comments

    Last time we took a look at how anonymous methods are implemented. Today we'll look at a puzzle that can be solved with what we've learned. Consider the following program fragment:

    using System;
    
    class MyClass {
     delegate void DelegateA();
     delegate void DelegateB();
    
     static DelegateB ConvertDelegate(DelegateA d)
     {
      return (DelegateB)
       Delegate.CreateDelegate(typeof(DelegateB), d.Method);
     }
    
     static public void Main()
     {
      int i = 0;
      ConvertDelegate(delegate { Console.WriteLine(0); });
     }
    }
    

    The ConvertDelegate method merely converts a DelegateA to a DelegateB by creating a DelegateB with the same underlying method. Since the two delegate types use the same signature, this conversion goes off without a hitch.

    But now let's make a small change to that Main function:

     static public void Main()
     {
      int i = 0;
      // one character change - 0 becomes i
      ConvertDelegate(delegate { Console.WriteLine(i); });
     }
    

    Now the program crashes with a System.ArgumentException at the point where we try to create the DelegateB. What's going on?

    First, observe that the overload of Delegate.CreateDelegate that was used is one that can only be used to create delegates from static methods. Next, note that in the original version of Main, the anonymous method references neither its own members nor any local variables from its lexically-enclosing method. Therefore, the resulting anonymous method is a "static anonymous method of the easy type". Since the anonymous method is a static member, the use of the "static members only" overload of Delegate.CreateDelegate succeeds.

    However, in the modified version of Main, the anonymous method references the i variable from its lexically-enclosing method. This forces the anonymous method to be an "anonymous method of the hard type", and those anonymous methods use an anonymous instance member function of an anonymous helper class. As a result, d.Method is an instance method, and the chosen overload of Delegate.CreateDelegate throws a System.ArgumentException since it works only with static methods.

    The solution is to use a different overload of Delegate.CreateDelegate, one that works with either static or instance member functions.

     static DelegateB ConvertDelegate(DelegateA d)
     {
      return (DelegateB)
       Delegate.CreateDelegate(typeof(DelegateB), d.Target, d.Method);
     }
    

    The Delegate.CreateDelegate(Type, Object, MethodInfo) overload creates a delegate for a static method if the Object parameter is null or a delegate for an instance method if the Object parameter is non-null. Hardly by coincidence, that is exactly what d.Target produces. If the original delegate is for a static method, then d.Target is null; otherwise, it is the object on which the instance method is to be invoked.

    This fix, therefore, makes the ConvertDelegate function handle conversion of delegates for either static or instance methods. Which is a good thing, because it may now be called upon to convert delegates for instance methods as well as static ones.
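    Here is a small self-contained sketch of the fixed conversion, exercised with both kinds of delegate. The class name ConvertDemo and the helpers Zero, ConvertStatic, and ConvertCapturing are invented for this illustration, and the delegates return int (rather than void) only so that the result can be checked. Whether a given anonymous method compiles to a static or an instance method is a compiler implementation detail, so the sketch uses a named static method for the static case.

```csharp
using System;

public static class ConvertDemo
{
    public delegate int DelegateA();
    public delegate int DelegateB();

    // Works whether d wraps a static method (Target == null)
    // or an instance method (Target != null).
    public static DelegateB ConvertDelegate(DelegateA d)
    {
        return (DelegateB)
            Delegate.CreateDelegate(typeof(DelegateB), d.Target, d.Method);
    }

    private static int Zero() { return 0; }

    // Converts a delegate for an ordinary static method.
    public static int ConvertStatic()
    {
        DelegateA a = new DelegateA(Zero);
        return ConvertDelegate(a)();
    }

    // Converts a delegate whose anonymous method captures a local,
    // which makes it the "hard" kind backed by a helper object.
    public static int ConvertCapturing()
    {
        int i = 42;
        DelegateA a = delegate { return i; };
        return ConvertDelegate(a)();
    }
}
```

    Passing d.Target along means the converter never has to know which kind of delegate it was handed.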

    Okay, this time we were lucky that the hidden gotcha of anonymous methods resulted in an exception. Next time, we'll see a gotcha that merely results in incorrect behavior that will probably take you forever to track down.

  • The Old New Thing

    The implementation of anonymous methods in C# and its consequences (part 1)

    • 50 Comments

    You may not even have realized that there are two types of anonymous methods. I'll call them the easy kind and the hard kind, not because they're actually easy and hard for you the programmer, but because they are easy and hard for the compiler.

    The easy kind is the anonymous method that doesn't use any local variables from its lexically-enclosing method. These are anonymous methods that could have been their own separate member functions; all the anonymization does is save you the trouble of coming up with names for them:

    class MyClass1 {
     int v = 0;
     delegate void MyDelegate(string s);
    
     MyDelegate MemberFunc()
     {
      int i = 1;
      return delegate(string s) {
              System.Console.WriteLine(s);
             };
      }
    }
    

    This particular anonymous method doesn't access any MyClass1 members, nor does it access the local variables of the MemberFunc function; therefore, it can be converted to a static method of the MyClass1 class:

    class MyClass1_converted {
     int v = 0;
     delegate void MyDelegate(string s);
    
     // Autogenerated by the compiler
     static void __AnonymousMethod$0(string s)
     {
      System.Console.WriteLine(s);
     }
    
     MyDelegate MemberFunc()
     {
      int i = 1;
      return __AnonymousMethod$0;
      // which is in turn shorthand for
      // return new MyDelegate(MyClass1.__AnonymousMethod$0);
      }
    }
    

    All the compiler did was give your anonymous methods a name and use that name in place of the "delegate (...) { ... }". (Note that all compiler-generated names I use here are purely illustrative. The actual compiler-generated name will be something different.)

    On the other hand, if your anonymous method used the this parameter, then that makes it an instance method instead of a static method:

    class MyClass2 {
     int v = 0;
     delegate void MyDelegate(string s);
    
     MyDelegate MemberFunc()
     {
      int i = 1;
      return delegate(string s) {
              System.Console.WriteLine("{0} {1}", v, s);
             };
      }
    }
    

    The anonymous method in MyClass2 uses the this keyword implicitly (to access the member variable v). Therefore, the conversion is to an instance member rather than to a static member.

    class MyClass2_converted {
     int v = 0;
     delegate void MyDelegate(string s);
    
     // Autogenerated by the compiler
     void __AnonymousMethod$0(string s)
     {
      System.Console.WriteLine("{0} {1}", v, s);
     }
    
     MyDelegate MemberFunc()
     {
      int i = 1;
      return this.__AnonymousMethod$0;
      // which is in turn shorthand for
      // return new MyDelegate(this.__AnonymousMethod$0);
      }
    }
    

    So far, we've only dealt with the easy cases. The transformation is local and not particularly complicated. These are the sorts of transformations you could make yourself without too much difficulty in the absence of anonymous methods.

    The hard case is where things get interesting. The body of an anonymous method is permitted to access the local variables of its lexically-enclosing method, in which case the compiler needs to keep those variables alive so that the body of your anonymous method can access them. Here's a sample anonymous method that accesses local variables from its lexically-enclosing method:

    class MyClass3 {
     int v = 0;
     delegate void MyDelegate(string s);
    
     MyDelegate MemberFunc()
     {
      int i = 1;
      return delegate(string s) {
              System.Console.WriteLine("{0} {1} {2}", i++, v, s);
             };
      }
    }
    

    In this example, the anonymous method prints "1 v s" the first time it is called, then "2 v s" the second time it is called, and so on, with the integer increasing by one. (Here, v and s stand for the current values of v and s, of course.) This happens because the i variable that the anonymous method is accessing is the same one each time, and it's the same i that the MemberFunc method was using, too. If the function were rewritten as

    class MyClass4 {
     int v = 0;
     delegate void MyDelegate(string s);
    
     MyDelegate MemberFunc()
     {
      int i = 0;
      MyDelegate d = delegate(string s) {
              System.Console.WriteLine("{0} {1} {2}", i++, v, s);
             };
      i = 1;
      return d;
      }
    }
    

    the behavior would be the same as in MyClass3. The creation of the delegate from the anonymous method does not make a copy of the i variable; changes to the i variable in the MemberFunc are visible to the anonymous method because both are accessing the same variable.
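    The sharing is easy to observe in a stripped-down sketch. The names SharedLocalDemo, Counter, and MakeCounter are invented for this illustration; the delegate returns the value instead of printing it so that the behavior can be checked.

```csharp
public static class SharedLocalDemo
{
    public delegate int Counter();

    // Returns a delegate that shares the local variable i
    // with the method that created it.
    public static Counter MakeCounter()
    {
        int i = 0;
        Counter c = delegate { return i++; };
        i = 1;      // visible to the delegate: same variable, not a copy
        return c;
    }
}
```

    The first call to the returned delegate yields 1, not 0, and the second yields 2. Each call to MakeCounter produces a delegate with its own independent i.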

    When faced with this "hard" type of anonymous method, wherein variables are shared with the lexically-enclosing method, the compiler generates a helper class:

    class MyClass4_converted {
     int v = 0;
     delegate void MyDelegate(string s);
    
     // Autogenerated by the compiler
     class __AnonymousClass$0 {
      public MyClass4_converted this$0;
      public int i;
      public void __AnonymousMethod$0(string s)
      {
        System.Console.WriteLine("{0} {1} {2}", i++, this$0.v, s);
      }
     }
    
     MyDelegate MemberFunc()
     {
      __AnonymousClass$0 locals$ = new __AnonymousClass$0();
      locals$.this$0 = this;
      locals$.i = 0;
      MyDelegate d = locals$.__AnonymousMethod$0;
      // which is in turn shorthand for
      // MyDelegate d = new MyDelegate(locals$.__AnonymousMethod$0);
      locals$.i = 1;
      return d;
     }
    }
    

    Wow, there was a lot of rewriting this time. A helper class was created to contain the local variables that were shared between the MemberFunc function and the anonymous method (in this case, just the variable i), as well as the hidden this parameter (which I have called this$0). In the MemberFunc function, access to that shared variable is done through this anonymous class, and the anonymous method that you wrote is a method on the anonymous class.

    Notice that the assignment to i in MemberFunc modifies the copy inside locals$, which is the same object that the anonymous method will be using when it runs. That's why it prints "1 v s" the first time: The value had already been changed to 1 by the time the delegate ran for the first time.

    Those who have done a good amount of C++ programming (or C# 1.0 programming) are well familiar with this technique, since C++ callbacks typically are given only one context variable; that context variable is usually a pointer to a larger structure that contains all the complex context you really want to operate on. C# 1.0 programmers went through a similar exercise. The "hard" type of anonymous method provides syntactic sugar that saves you the hassle of having to declare and manage the helper class.
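    For comparison, here is roughly what that hand-written helper class looks like in legal C#, since the compiler-style names above are not valid identifiers. This is a sketch, not actual compiler output: the names HandWrittenClosure, Locals, and outer are invented, and the method returns the formatted string instead of printing it so that the result can be checked.

```csharp
public class HandWrittenClosure
{
    public delegate string MyDelegate(string s);

    private int v = 0;

    // Plays the role of the compiler-generated helper class.
    private class Locals
    {
        public HandWrittenClosure outer;   // the captured "this"
        public int i;                      // the captured local

        public string Method(string s)
        {
            return string.Format("{0} {1} {2}", i++, outer.v, s);
        }
    }

    public MyDelegate MemberFunc()
    {
        Locals locals = new Locals();
        locals.outer = this;
        locals.i = 0;
        MyDelegate d = new MyDelegate(locals.Method);
        locals.i = 1;    // the "local variable" now lives in the helper object
        return d;
    }
}
```

    Calling the returned delegate with "x" produces "1 0 x" the first time and "2 0 x" the second time, because both MemberFunc and the delegate operate on the same Locals object.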

    If you thought about it some, you'd have realized that the way it's done is pretty much the only way it could have been done. It turns out that most computer programming doesn't consist of being clever or making hard decisions. You just have one kernel of an idea ("hey let's have anonymous methods") and then the rest is just doing what has to be done, no actual decisions needed. You just do the obvious thing. Most programming consists of just doing the obvious thing.

    Okay, so that's a quick introduction to the implementation of anonymous methods in C#. Mind you, this information isn't just for your personal edification. It's actually important that you understand how these work (and not just treat them as "magic"), because lack of said understanding can lead to subtle programming errors. We'll look at those types of errors over the next few days.

  • The Old New Thing

    C# nested classes are like C++ nested classes, not Java inner classes

    • 24 Comments

    When you declare a class inside another class, the inner class still acts like a regular class. The nesting controls access and visibility, but not behavior. In other words, all the rules you learned about regular classes also apply to nested classes.

    The this keyword in an instance method of a class (nested or not) can be used to access members of that class and only those members. It cannot be used to access members of other classes, at least not directly. (And the this can be omitted when it would not result in ambiguity.) You create an instance of a class (nested or not) by saying new ClassName(...) where ... are the parameters to an applicable class constructor.

    Java static nested classes behave the same way, but Java also has the concept of inner classes. To construct an instance of an inner class in Java, you write o.new InnerClass(...) where ... as before are the parameters to an applicable class constructor. The o in front is an expression that evaluates to an object whose type is that of the outer class. The inner class can then use the this keyword to access its own members as well as those of the instance of the outer class to which it was bound.

    In C++ and C#, you will have to implement this effect manually. It's not hard, though:

    // Java
    class OuterClass {
     String s;
     // ...
     class InnerClass {
    
      public InnerClass() { }
      public String GetOuterString() { return s; }
     }
     void SomeFunction() {
      InnerClass i = this.new InnerClass();
      i.GetOuterString();
     }
    }
    
    // C#
    class OuterClass {
     string s;
     // ...
     class InnerClass {
      OuterClass o_;
      public InnerClass(OuterClass o) { o_ = o; }
      public string GetOuterString() { return o_.s; }
     }
     void SomeFunction() {
      InnerClass i = new InnerClass(this);
      i.GetOuterString();
     }
    }
    

    In Java, the inner class has a secret this$0 member which remembers the instance of the outer class to which it was bound. Creating an instance of an inner class via the o.new InnerClass(...) notation is treated as if you had written new InnerClass(o, ...), where o is automatically assigned to the secret this$0 member, and attempts to access members of the outer class are automatically treated as if they were written this$0.outermember. (This description of how inner classes are implemented is not just conceptual. It is spelled out in the language specification.)

    The C# equivalent to this code merely makes explicit the transformation that in Java was implicit. We give the inner class a reference to the outer class (here, we called it o_) and pass it as an explicit parameter to the inner class's constructor. And when we want to access a member of that outer class, we use o_ to do it.

    In other words, Java inner classes are syntactic sugar that is not available to C#. In C#, you have to do it manually.

    If you want, you can create your own sugar:

    class OuterClass {
     ...
     InnerClass NewInnerClass() {
      return new InnerClass(this);
     }
     void SomeFunction() {
      InnerClass i = this.NewInnerClass();
      i.GetOuterString();
     }
    }
    

    Where you would want to write in Java o.new InnerClass(...) you can write in C# either o.NewInnerClass(...) or new InnerClass(o, ...). Yes, it's just a bunch of moving the word new around. Like I said, it's just sugar.
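    Putting the pieces together, a complete compilable version of the C# side might look like the following sketch. The constructor that takes the string and the public access modifiers are additions made for this illustration so that the class can be used from outside.

```csharp
public class OuterClass
{
    private string s;

    public OuterClass(string s) { this.s = s; }

    public class InnerClass
    {
        // Explicit stand-in for the outer-instance reference
        // that Java maintains implicitly.
        private OuterClass o_;

        public InnerClass(OuterClass o) { o_ = o; }

        // A nested class may access private members of its containing class.
        public string GetOuterString() { return o_.s; }
    }

    // The hand-rolled sugar: lets callers write o.NewInnerClass().
    public InnerClass NewInnerClass()
    {
        return new InnerClass(this);
    }
}
```

    With this in place, new OuterClass("hi").NewInnerClass().GetOuterString() and new OuterClass.InnerClass(o).GetOuterString() both reach the outer object's string.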

    Now, I'm not saying that the Java way of representing inner classes isn't useful. It's a very nice piece of sugar if you access the outer class's members frequently from the inner class. However, it's not the type of transformation that makes you say, "Well, if a language doesn't support this, it's too hard for me to implement it manually, so I'll just give up." The conversion is not that complicated and consists entirely of local changes that can be performed without requiring a lot of thought.

    As a postscript, my colleague Eric Lippert points out that JScript.NET does have instance-bound inner classes.

    class Outer {
     var s;
     class Inner {
      function GetOuterString() {
       return s;
      }
     }
    }
    
    var o = new Outer();
    o.s = "hi";
    var i = new o.Inner();
    i.GetOuterString();
    