• The Old New Thing

    Why does the primary monitor have (0,0) as its upper left coordinate?

    By definition, the primary monitor is the monitor that has (0,0) as its upper left corner. Why can't the primary monitor be positioned somewhere else?

    Well, sure you could do that, but then you'd have to invent a new name for the monitor whose upper left corner is at (0,0), and then you're back where you started.

    In other words, it's just a name. You could ask, "Why can't starboard be on the left-hand side of the boat?" Well, sure you could do that, but then you'd have to come up with a new name for the right-hand side of the boat, and then things are pretty much the same as they were, just with different names.

    Perhaps a more descriptive (but clumsier) name for the primary monitor would be the backward-compatibility monitor (for applications which do not support multiple monitors), because that's what the primary monitor is. If an application is not multiple-monitor aware, then any time it asks for properties of "the" monitor, it gets information about the backward-compatibility monitor. A call to GetSystemMetrics(SM_CXSCREEN) gives the width of the backward-compatibility monitor, GetSystemMetrics(SM_CYMAXIMIZED) gives the height of a window maximized on the backward-compatibility monitor, and positioning a window at (0,0) puts it at the upper left corner of the backward-compatibility monitor.

    The window manager still respects window coordinates passed to functions like CreateWindow or SetWindowPos. If you pass coordinates that are on a secondary monitor—oops—a monitor different from the backward-compatibility monitor, then the window manager will happily put the window there. These coordinates might be the result of a program that is multiple-monitor aware, or possibly merely from a program which is multiple-monitor agnostic.
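    To make the coordinate talk concrete, here is a toy model. It is not the Win32 API: the MonitorRect type, the VirtualScreen helpers, and the monitor layout are all made up for illustration. The layout assumes a second monitor sitting to the left of the primary, which puts it at negative x coordinates.

```csharp
using System;

// Hypothetical model of a two-monitor desktop; none of these names come from Windows.
public class MonitorRect
{
    public readonly string Name;
    public readonly int Left, Top, Right, Bottom;   // Right and Bottom are exclusive

    public MonitorRect(string name, int left, int top, int width, int height)
    {
        Name = name;
        Left = left;
        Top = top;
        Right = left + width;
        Bottom = top + height;
    }

    public bool Contains(int x, int y)
    {
        return x >= Left && x < Right && y >= Top && y < Bottom;
    }
}

public static class VirtualScreen
{
    // Assumed layout: a 1920x1080 primary with a 1280x1024 monitor to its left.
    public static readonly MonitorRect[] Monitors =
    {
        new MonitorRect("primary", 0, 0, 1920, 1080),
        new MonitorRect("secondary", -1280, 0, 1280, 1024),
    };

    // The primary monitor is the one whose upper left corner is (0,0).
    public static string PrimaryName()
    {
        foreach (MonitorRect m in Monitors)
            if (m.Left == 0 && m.Top == 0) return m.Name;
        return null;
    }

    // Returns the name of the monitor containing the point, or null if no monitor does.
    public static string MonitorAt(int x, int y)
    {
        foreach (MonitorRect m in Monitors)
            if (m.Contains(x, y)) return m.Name;
        return null;
    }
}
```

    In this model, a window positioned at (0,0) lands on the primary monitor, and a window positioned at (-100,50) lands on the other one, with nobody needing to be "aware" of anything.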

    Multiple-monitor agnosticism is a term I just made up which refers to programs which might not explicitly support multiple monitors, but at least were open to the possibility of multiple monitors by not making assumptions about the number of monitors but instead using functions like RectVisible to determine what the visible portions of the screen are. These techniques were hot topics many years ago when you wanted to write a program that ran both on single-monitor-only versions of Windows as well as multiple-monitor versions of Windows; nowadays they are rather old-fashioned, like coming up with mnemonics for all your friends' telephone numbers so you didn't have to keep looking them up. (Today, you just go ahead and call the multiple monitor functions and use the address book function in your mobile phone to remember your friends' phone numbers.)

    It is not the case that the primary monitor is the "applications show up here first" monitor. As noted earlier, applications show up on whatever monitor they ask for, whether they asked for it explicitly (hunting around for a monitor and using it) or implicitly (restoring to the same coordinates they were on when they were last run).

    Of course, programs which pass CW_USEDEFAULT to the CreateWindow function explicitly abdicated the choice of the window position and therefore the monitor. In that case, the window manager tries to guess an appropriate monitor. If the new window has a parent or owner, then it is placed on the same monitor as that parent or owner; otherwise, the window manager just puts the window on the backward-compatibility monitor, for lack of a better idea.

    I challenge you to come up with an even lamer physics pun

    The other day, I was in the office kitchenette, and two of my colleagues both named Paul happened to be there getting coffee. I quipped, "Oh no, is this legal? I think it's a violation of the Paul Exclusion Principle."

    It was a horrible physics pun, perhaps one of the worst I've made in a long time. My challenge to you is to come up with an even worse one that you've told.

    Note: You have to have actually made the pun to an appropriate audience. No fair just making one up for the purpose of the challenge.

    How do I get the Explorer navigation pane to highlight the current folder all the time?

    In Windows 7, the folder tree in the Explorer navigation pane by default no longer highlights the item in the view pane. This change was based on user testing and feedback, but if, like me, you prefer things the old way, you can play with two new check boxes on the Folder Options dialog. You can get to Folder Options in a variety of ways:

    • From the Control Panel, go to Appearance and Personalization → Folder Options. (Or just type Folder Options into the Control Panel search box or the Start menu search box to go straight there.)
    • From the Explorer menu bar, select Tools → Folder options.
    • From the Explorer command bar, select Organize → Folder and search options.
    • Or you can exercise your super élite status and just right-click on a blank space in the navigation pane.

    However you wind up there, the item you want to turn on is Automatically expand to current folder (or Expand to current folder if you use the super élite method).

    Microspeak: The funnel

    In the Customer Service and Support part of Microsoft, you will often see the term funnel. Here are some citations:

    Effectively and efficiently solve issues by driving levers across the entire funnel.
    Putting the Fun in Funnel.
    Strengthening the front of the funnel.

    The funnel is a way of viewing customer support engagements. For some reason, the funnel diagram is always drawn on its side with the mouth (the fat part) on the left and the stem (the narrow part) on the right. The width of the funnel represents the volume of customers at that stage of the support process, and the progress through the funnel represents how much time the customer has spent working on a solution.

    At the mouth of the funnel are the customers who turn to built-in product help, online help, forums, Knowledge Base articles, blog entries, training materials, and similar resources. A significant percentage of customers get the help they need via self-help, where the solution to their problem existed even before they asked the question; they just had to find it.

    The funnel narrows, and the customer picks up the phone or otherwise initiates a support incident. A support technician helps the customer via email, live chat, phone call, whatever. Another percentage of customers get their problem solved at this stage. It took longer, but the problem did get solved.

    At the stem of the funnel are the customers whose problems remain unresolved, and now things get bad. The problem takes days to resolve, multiple engineers get involved, and maybe even a site visit is called for.

    There is a concerted effort to improve the support resources at the front of the funnel. Of course, there are efforts to improve the support process at all of the stages, but the front of the funnel is a particular area of focus, since that's where everybody starts out, and that's where most users get their solutions. The term front of the funnel is in such heavy use it even gets its own acronym: FoF. Is it pronounced foff? Beats me.

    What was that story about the WinHelp pen-writing-in-book animation?

    The first time you open a WinHelp file, you get this pen-writing-in-book animation while WinHelp does, um, something which it passes off as "preparing Help file for first use" or something similarly vague.

    I remember a conversation about that animation. The Windows shell team suggested to the author of WinHelp that the program use the shell common animation control to display that animation. After all, it's a short animation and it met the requirements for the animation common control. But the author of WinHelp rejected the notion.

    (Pre-emptive "I can't believe I had to write this": This conversation has been exaggerated for effect.)

    "Your animation control is so huge and bloated. I can do it much smaller and faster myself. The RLE animation format generates frames by re-rendering the pixels that have changed, which means that at each frame of the animation, a new pen image would be recorded in the AVI file. The pen cycles through three different orientations at each location, there are ten locations on each row, and there are four rows. If I used an RLE animation, that'd be 3 × 10 × 4 = 120 copies of the pen bitmap. Instead, I have just three pen bitmaps, and I manually draw them at the appropriate location for each frame. Something like this:

    // NOTE: For simplicity, I'm ignoring the "turn the page" animation
    void DrawFrame(int frame)
    {
      // Calculate our position in the animation
      int penframe = frame % 3; // 3 pen images per location
      int column = (frame / 3) % 10; // 10 columns per row
      int row = (frame / 30) % 4; // 4 rows
      int i;
      POINT pt;
    
      DrawBlankPage(0, 0); // start with a blank sheet of paper
    
      // Draw the "text" that the pen "wrote" in earlier rows
      for (i = 0; i < row; i++) {
        DrawTextScribble(i, 0, 9);
      }
    
      // Draw the partially-completed row that the pen is on now
      DrawTextScribble(row, 0, column);
    
      // Position the pen image so the pen tip hits the "draw" point
      GetTextScribblePoint(column, row, &pt);
      DrawPenBitmap(penBitmaps[penframe], pt.x - 1, pt.y - 5);
    }
    

    "See? In just a few lines of code, I have a complete animation. All I needed was the three pen images and a background bitmap showing a book opened to a blank page. This is way more efficient both in terms of memory and execution time than your stupid animation common control. You shell guys could learn a thing or two about programming."

    "Okay, fine, don't need to get all defensive about it. We were just making a suggestion, that's all."

    Time passes, and Windows 95 is sent off for translation into the however many languages it is localized for. A message comes in from some of the localization teams. It seems that some locales need to change the animation. For example, the Arabic version of Windows needs the pen to write on the left-hand pages, the pen motion should be right to left, and the pages need to flip from left to right. Perhaps the Japanese translators are okay with the pen motion, but they want the pages to flip from left to right.

    The localization team contacted the WinHelp author. "We're trying to change the animation, but we can't find the AVI file in the resources. Can you advise us on how we should localize the animation?"

    Unfortunately, the WinHelp author had to tell the localization team that the direction of pen motion and the locations of the ink marks are hard-coded into the program. Since the product had already passed code lockdown, there was nothing that could be done. WinHelp shipped with a pen that moved in the wrong direction in some locales.

    Moral of the story: There's more to software development than programming for performance. Today we learned about localizability.

    What happened to WinHelp?

    Commenter winhelp (probably not his/her real name) wonders what happened to WinHelp.exe.

    I don't know, but it turns out the answer was already known to the Internet. At the time the question was posted, the answer was already in the Wikipedia entry for Windows Help—it even had a citation!

    The question does highlight another one of those "no matter what you do, somebody will call you an idiot" dilemmas. On the one side, we have "Windows is already so big, what's the harm in adding another megabyte to the size to add this feature that is rarely used, primarily by older applications, so that customers won't have to download it?" On the other side, we have "Windows is too big, why not get rid of the components that exist only for the benefit of older applications and make them optional downloads?"

    What probably swung the pendulum to the "remove it from the core product" side is the fact that the Windows help file format is equivalent to an EXE. (I don't know this personally; I'm just reading the Wikipedia article.) If somebody can trick you into clicking on a rogue HLP file, they can run arbitrary code and take over your account. The underlying functionality is useful, because you can write help files with links like "Click here to open the Options dialog", and clicking the link will actually open the Options dialog (by invoking some accompanying native code that calls whatever APIs are necessary to get that Options dialog to open).

    WinHelp came from the days before the Internet, when HyperCard was the reigning champion for page-based information presentation. You didn't have to worry that double-clicking a file on a remote server might take over your computer because you couldn't contact remote servers in the first place! (And if you could, it was because you were on a local-area network where all the computers were operated by your co-workers or other people you trusted.)

    As I recall, there are some help-file-based viruses out there, so the security aspect is not merely a theoretical discussion. Removing the attack surface from the default configuration reduces the value of the help file attack. (Historians may note that HyperCard also permitted execution of arbitrary native code attached to a HyperCard deck. There were also HyperCard viruses.)

    But now that you mention WinHelp, I remember a story about the little pen-writing-in-book animation that appears when the help engine is "preparing Help file for first use" (whatever that means). I'll take that up tomorrow.

    When do I need to use GC.KeepAlive?

    Finalization is the crazy wildcard in garbage collection. It operates "behind the GC", running after the GC has declared an object dead. Think about it: Finalizers run on objects that have no active references. How can this be a reference to an object that has no references? That's just crazy-talk!

    Finalizers are a Ouija board, permitting dead objects to operate "from beyond the grave" and affect live objects. As a result, when finalizers are involved, there is a lot of creepy spooky juju going on, and you need to tread very carefully, or your soul will become cursed.

    Let's step back and look at a different problem first. Consider this class which doesn't do anything interesting but works well enough for demonstration purposes:

    class Sample1 {
     private StreamReader sr;
     // Open the file; the reader stays open until Close() is called
     public Sample1(string file) { sr = new StreamReader(file); }
     public void Close() { sr.Close(); }
     public string NextLine() { return sr.ReadLine(); }
    }
    

    What happens if one thread calls Sample1.NextLine() and another thread calls Sample1.Close()? If the NextLine() call wins the race, then you have a stream closed while it is in the middle of its ReadLine method. Probably not good. If the Close() call wins the race, then when the NextLine() call is made, you end up reading from a closed stream. Definitely not good. Finally, if the NextLine() call runs to completion before the Close(), then the line is successfully read before the stream is closed.

    Having this race condition is clearly an unwanted state of affairs since the result is unpredictable.
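    As a minimal sketch of the case where Close() wins the race, here is the same sequence performed on a single thread so that the outcome is deterministic. The ClosedReaderDemo class and the in-memory stream are stand-ins for illustration, not part of the original sample.

```csharp
using System;
using System.IO;
using System.Text;

public static class ClosedReaderDemo
{
    // Returns the name of the exception thrown when reading after Close(),
    // or "none" if the read unexpectedly succeeds.
    public static string ReadAfterClose()
    {
        byte[] data = Encoding.UTF8.GetBytes("first\nsecond\n");
        StreamReader sr = new StreamReader(new MemoryStream(data));

        sr.Close();            // the Close() side of the race finishes first

        try
        {
            sr.ReadLine();     // the NextLine() side now reads from a closed reader
            return "none";
        }
        catch (ObjectDisposedException ex)
        {
            return ex.GetType().Name;
        }
    }
}
```

    With two real threads you would see this failure only some of the time, which is what makes the race so unpleasant to debug.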

    Now let's change the Close() method to a finalizer.

    class Sample2 {
     private StreamReader sr;
     public Sample2(string file) { sr = new StreamReader(file); }
     // Finalizer: runs at some unspecified time after the object becomes unreachable
     ~Sample2() { sr.Close(); }
     public string NextLine() { return sr.ReadLine(); }
    }
    

    Remember that we learned that an object becomes eligible for garbage collection when there are no active references to it, and that it can happen even while a method on the object is still active. Consider this function:

    string FirstLine(string fileName) {
     Sample2 s = new Sample2(fileName);
     return s.NextLine();
    }
    

    We learned that the Sample2 object becomes eligible for collection during the execution of NextLine(). Suppose that the garbage collector runs and collects the object while NextLine is still running. This could happen if ReadLine takes a long time, say, because the hard drive needs to spin up or there is a network hiccup; or it could happen just because it's not your lucky day and the garbage collector ran at just the wrong moment. Since this object has a finalizer, the finalizer runs before the memory is discarded, and the finalizer closes the StreamReader.

    Boom, we just hit the race condition we considered when we looked at Sample1: The stream was closed while it was being read from. The garbage collector is a rogue thread that closes the stream at a bad time. The problem occurs because the garbage collector doesn't know that the finalizer is going to make changes to other objects.

    Classically speaking, there are three conditions which in combination lead to this problem:

    1. Containment: An entity a retains a reference to another entity b.
    2. Incomplete encapsulation: The entity b is visible to an entity outside a.
    3. Propagation of destructive effect: Some operation performed on entity a has an effect on entity b which alters its proper usage (usually by rendering it useless).

    The first condition (containment) is something you do without a second's thought. If you look at any class, there's a very high chance that it has, among its fields, a reference to another object.

    The second condition (incomplete encapsulation) is also a common pattern. In particular, if b is an object with methods, it will be visible to itself.

    The third condition (propagation of destructive effect) is the tricky one. If an operation on entity a has a damaging effect on entity b, the code must be careful not to damage it while it's still being used. This is something you usually take care of explicitly, since you're the one who wrote the code that calls the destructive method.

    Unless the destructive method is a finalizer.

    If the destructive method is a finalizer, then you do not have complete control over when it will run. And it is one of the fundamental laws of the universe that events will occur at the worst possible time.

    Enter GC.KeepAlive(). The purpose of GC.KeepAlive() is to force the garbage collector to treat the object as still live, thereby preventing it from being collected, and thereby preventing the finalizer from running prematurely.

    (Here's the money sentence.) You need to use GC.KeepAlive when the finalizer for an object has a destructive effect on a contained object.
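    Here is roughly what that looks like for the FirstLine example. This is a sketch under assumptions: FinalizedReader is just Sample2 under another name, and KeepAliveDemo is a made-up wrapper class so that the fragment compiles on its own.

```csharp
using System;
using System.IO;

// Same shape as Sample2: the finalizer has a destructive effect on the contained reader.
public class FinalizedReader
{
    private StreamReader sr;
    public FinalizedReader(string file) { sr = new StreamReader(file); }
    ~FinalizedReader() { sr.Close(); }
    public string NextLine() { return sr.ReadLine(); }
}

public static class KeepAliveDemo
{
    public static string FirstLine(string fileName)
    {
        FinalizedReader s = new FinalizedReader(fileName);
        string line = s.NextLine();

        // Treat s as live up to this point, so its finalizer cannot
        // run while NextLine() is still reading from the stream.
        GC.KeepAlive(s);
        return line;
    }
}
```

    The GC.KeepAlive call has no visible effect of its own; its only job is to be a use of s that comes after NextLine returns.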

    The problem is that it's not always clear which objects have finalizers which have destructive effect on a contained object. There are some cases where you can suspect this is happening due to the nature of the object itself. For example, if the object manages something external to the CLR, then its finalizer will probably destroy the external object. But there can be other cases where the need for GC.KeepAlive is not obvious.

    A much cleaner solution than using GC.KeepAlive is to use the IDisposable interface, formalized by the using keyword. Everybody knows that the using keyword ensures that the object being used is disposed at the end of the block. But it's also the case (and it is this behavior that is important today) that the using keyword also keeps the object alive until the end of the block. (Why? Because the object needs to be alive so that we can call Dispose on it!)
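    A sketch of the same FirstLine function written that way follows. The DisposableReader and UsingDemo names are invented for this example; the point is only that the cleanup moves out of a finalizer and into Dispose, where you control when it runs.

```csharp
using System;
using System.IO;

// Hypothetical replacement for Sample2: explicit disposal instead of a finalizer.
public sealed class DisposableReader : IDisposable
{
    private readonly StreamReader sr;
    public DisposableReader(string file) { sr = new StreamReader(file); }
    public string NextLine() { return sr.ReadLine(); }
    public void Dispose() { sr.Dispose(); }
}

public static class UsingDemo
{
    public static string FirstLine(string fileName)
    {
        using (DisposableReader s = new DisposableReader(fileName))
        {
            // s must stay alive until the end of the block,
            // because Dispose() still has to be called on it.
            return s.NextLine();
        } // s.Dispose() runs here, after NextLine() has returned
    }
}
```

    Since the reader is closed at a known point, the file is also released promptly rather than whenever the garbage collector gets around to it.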

    This is one of the reasons I don't like finalizers. Since they operate underneath the GC, they undermine many principles of garbage collected systems. (See also resurrection.) As we saw earlier, a correctly-written program cannot rely on side effects of a finalizer, so in theory all finalizers could be nop'd out without affecting correctness.

    The garbage collector purist in me also doesn't like finalizers because they prevent the running time of a garbage collector from being proportional to the amount of live data, as in, say, a classic two-space collector. (There is also a small constant associated with the amount of dead data, which means that the overall complexity is proportional to the amount of total data.)

    If I ruled the world, I would decree that the only thing you can do in a finalizer is perform some tests to ensure that all the associated external resources have already been explicitly released, and if not, raise a fatal exception: System.Exception.Resource­Leak.

    How can I find all objects of a particular type?

    More than one customer has asked a question like this:

    I'm looking for a way to search for all instances of a particular type at runtime. My goal is to invoke a particular method on each of those instances. Note that I did not create these objects myself or have any other access to them. Is this possible?

    Imagine what the world would be like if it were possible.

    For starters, just imagine the fun you could have if you could call typeof(Secure­String).Get­Instances(). Vegas road trip!

    More generally, it breaks the semantics of App­Domain boundaries, since grabbing all instances of a type lets you get objects from another App­Domain, which fundamentally violates the point of App­Domains. (Okay, you could repair this by saying that the Get­Instances method only returns objects from the current App­Domain.)

    This imaginary Get­Instances method might return objects which are awaiting finalization, which violates one of the fundamental assumptions of a finalizer, namely that there are no references to the object: If there were, then it wouldn't be finalized! (Okay, you could repair this by saying that the Get­Instances method does not return objects which are awaiting finalization.)

    On top of that, you break the syncRoot pattern.

    class Sample {
     private object syncRoot = new object();
     public void Method() {
      lock(syncRoot) { ... };
     }
    }
    
    If it were possible to get all objects of a particular class, then anybody could just reach in and grab your private syncRoot and call Monitor.Enter() on it. Congratulations, the private synchronization object you created is now a public one that anybody can screw with, defeating the whole purpose of having a private syncRoot. You can no longer reason about your syncRoot because you are no longer in full control of it. (Yes, this can already be done with reflection, but at least when reflecting, you know that you're grabbing somebody's private field called syncRoot, so you already recognize that you're doing something dubious. Whereas with GetInstances, you don't know what each of the returned objects is being used for. Heck, you don't even know if it's being used! It might just be garbage lying around waiting to be collected.)

    More generally, code is often written on the expectation that an object that you never give out a reference to is not accessible to others. Consider the following code fragment:

    using (StreamWriter sr = new StreamWriter(fileName)) {
     sr.WriteLine("Hello");
    }
    

    If it were possible to get all objects of a particular class, you may find that your customers report that they are getting an Object­Disposed­Exception on the call to Write­Line. How is that possible? The disposal doesn't happen until the close-brace, right? Is there a bug in the CLR where it's disposing an object too soon?

    Nope, what happened is that some other thread did exactly what the customer was asking for a way to do: It grabbed all existing Stream­Writer instances and invoked Stream­Writer.Close on them. It did this immediately after you constructed the Stream­Writer and before you did your sr.Write­Line(). Result: When your sr.Write­Line() executes, it finds that the stream was already closed, and therefore the write fails.

    More generally, consider the graffiti you could inject into all output files by doing

    foreach (StreamWriter sr in typeof(StreamWriter).GetInstances()) {
     sr.Write("Kilroy was here!");
    }
    

    or even crazier

    foreach (StringBuilder sb in typeof(StringBuilder).GetInstances()) {
     sb.Insert(0, "DROP TABLE users; --");
    }
    

    Now no String­Builder is safe—the contents of any String­Builder can be corrupted at any time!

    If you could obtain all instances of a type, the fundamental logic behind computer programming breaks down. It effectively becomes impossible to reason about code because anything could happen to your objects at any time.

    If you need to be able to get all instances of a class, you need to add that functionality to the class itself. (GC­Handle or Weak­Reference will come in handy here.) Of course, if you do this, then you clearly opted into the "anything can happen to your object at any time outside your control" model and presumably your code operates accordingly. You made your bed; now you get to lie in it.

    (And I haven't even touched on thread safety.)
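    For a class you own, opting in might look something like the sketch below. The Widget class and its registry are hypothetical, and the sketch uses the generic WeakReference<T> so that the registry itself does not keep instances alive.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class that opts into "find all my instances" by tracking itself.
public class Widget
{
    private static readonly object registryLock = new object();
    private static readonly List<WeakReference<Widget>> registry =
        new List<WeakReference<Widget>>();

    public string Name { get; private set; }

    public Widget(string name)
    {
        Name = name;
        lock (registryLock)
        {
            // Weak reference: being in the registry does not prevent collection.
            registry.Add(new WeakReference<Widget>(this));
        }
    }

    // Returns strong references to every instance that has not been collected yet.
    public static List<Widget> GetInstances()
    {
        List<Widget> live = new List<Widget>();
        lock (registryLock)
        {
            for (int i = registry.Count - 1; i >= 0; i--)
            {
                Widget w;
                if (registry[i].TryGetTarget(out w))
                    live.Add(w);
                else
                    registry.RemoveAt(i);   // prune entries whose object is gone
            }
        }
        return live;
    }
}
```

    Note that the lock protects only the registry. Every caveat above still applies to the Widget objects themselves: whoever calls GetInstances can poke at them at any time, so the class has to be written with that in mind.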

    Bonus reading: Questionable value of SyncRoot on Collections.

    How do I get the reference count of a CLR object?

    A customer asked the rather enigmatic question (with no context):

    Is there a way to get the reference count of an object in .Net?

    Thanks,
    Bob Smith
    Senior Developer
    Contoso

    The CLR does not maintain reference counts, so there is no reference count to "get". The garbage collector only cares about whether an object has zero references or at least one reference. It doesn't care if there is one, two, twelve, or five hundred—from the point of view of the garbage collector, one is as good as five hundred.

    The customer replied,

    I am aware of that, yet the mechanism is somehow implemented by the GC...

    What I want to know is whether at a certain point there is more then one variable pointing to the same object.

    As already noted, the GC does not implement the "count the number of references to this object" algorithm. It only implements the "Is it definitely safe to reclaim the memory for this object?" algorithm. A null garbage collector always answers "No." A tracing collector looks for references, but it only cares whether it found one, not how many it found.

    The discussion of "variables pointing to the same objects" is somewhat confused, because you can have references to an object from things other than variables. Parameters to a method contain references, the implicit this is also a reference, and partially-evaluated expressions also contain references. (During execution of the line string s = o.ToString();, at the point immediately after o.ToString() returns and before the result is assigned to s, the string has an active reference but it isn't stored in any variable.) And as we saw earlier, merely storing a reference in a variable doesn't prevent the object from being collected.

    It's clear that this person solved half of his problem, and just needs help with the other half, the half that doesn't make any sense. (I like how he immediately weakened his request from "I want the exact reference count" to "I want to know if it is greater than one." Because as we all know, the best way to solve a problem is to reduce it to an even harder problem.)

    Another person used some psychic powers to figure out what the real problem is:

    If I am reading properly into what you mean, you may want to check out the Weak­Reference class. This lets you determine whether an object has been collected. Note that you don't get access to a reference count; it's a zero/nonzero thing. If the Weak­Reference is empty, it means the object has been collected. You don't get a chance to act upon it (as you would if you were the last one holding a reference to it).

    The customer explained that he tried Weak­Reference, but it didn't work. (By withholding this information, the customer made the mistake of not saying what he already tried and why it didn't work.)

    Well this is exactly the problem: I instantiate an object and then create a Weak­Reference to it (global variable).

    Then at some point the object is released (set to null, disposed, erased from the face of the earth, you name it) yet if I check the Is­Alive property it still returns true.

    Only if I explicitly call to GC.Collect(0) or greater before the check it is disposed.

    The customer still hasn't let go of the concept of reference counting, since he says that the object is "released". In a garbage-collected system, objects are not released; rather, you simply stop referencing them. And disposing of an object still maintains a reference; disposing just invokes the IDisposable.Dispose method.

    FileStream fs = new FileStream(fileName, FileMode.Open);
    using (fs) {
     ...
    }
    

    At the end of this code fragment, the FileStream has been disposed, but there is still a reference to it in the fs variable. Mind you, that reference isn't very useful, since there isn't much you can do with a disposed object. Even if you rewrite the fragment as

    using (FileStream fs = new FileStream(fileName, FileMode.Open)) {
     ...
    }
    

    the variable fs still exists after the close-brace; it simply has gone out of scope (i.e., you can't access it any more). Scope is not the same as lifetime. Of course, the optimizer can step in and make the object eligible for collection once the value becomes inaccessible, but there is no requirement that this optimization be done.

    The fact that the Is­Alive property says true even after all known references have been destroyed is also no surprise. The environment does not check whether an object's last reference has been made inaccessible every time a reference changes. One of the major performance benefits of garbage collected systems comes from the de-amortization of object lifetime determination. Instead of maintaining lifetime information about an object continuously (spending a penny each time a reference is created or destroyed), it saves up those pennies and splurges on a few dollars every so often. The calculated risk (which usually pays off) is that the rate of penny-saving makes up for the occasional splurges.

    It does mean that between the splurges, the garbage collector does not know whether an object has outstanding references or not. It doesn't find out until it does a collection.

    The null garbage collector takes this approach to an extreme by simply hoarding pennies and never spending them. It saves a lot of money but consumes a lot of memory. The other extreme (common in unmanaged environments) is to spend the pennies as soon as possible. It spends a lot of money but reduces memory usage to the absolute minimum. The designers of a garbage collector work to find the right balance between these two extremes, saving money overall while still keeping memory usage at a reasonable level.

    The customer appears to have misinterpreted what the Is­Alive property means. The property doesn't say whether there are any references to the object. It says whether the object has been garbage collected. Since the garbage collector can run at any time, there is nothing meaningful you can conclude if Is­Alive returns true, since it can transition from alive to dead while you're talking about it. On the other hand, once it's dead, it stays dead; it is valid to take action when Is­Alive is false. (Note that there are two types of Weak­Reference; the difference is when they issue the death certificate.)
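    In practice this means you don't test IsAlive and then go fetch the object; you read Target once into a local variable and check that for null. The helper below is a sketch with made-up names (WeakReferenceDemo, LengthOrMinusOne) showing the shape of that pattern.

```csharp
using System;
using System.Text;

public static class WeakReferenceDemo
{
    // Returns the length of the tracked StringBuilder, or -1 if it has been collected.
    public static int LengthOrMinusOne(WeakReference weak)
    {
        // Pattern to avoid:
        //   if (weak.IsAlive) { ... weak.Target ... }
        // The object can be collected between the check and the use.

        // Read Target once. While the local holds the object, it cannot be collected.
        StringBuilder sb = weak.Target as StringBuilder;
        return sb != null ? sb.Length : -1;
    }
}
```

    A result of -1 is a fact you can act on, because dead objects stay dead. Any other result tells you only that the object had not been collected at the moment Target was read.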

    The name Is­Alive for the property could be viewed as misleading if you just look at the property name without reading the accompanying documentation. Perhaps a more accurate (but much clumsier) name would have been Has­Not­Been­Collected. The theory is, presumably, that if you're using an advanced class like Weak­Reference, which works "at the GC level", you need to understand the GC.

    The behavior the customer is seeing is correct. The odds that the garbage collector has run between annihilating the last live reference and checking the IsAlive property are pretty low, so when you ask whether the object has been collected, the answer will be No. Of course, forcing a collection will cause the garbage collector to run, and that's what does the collection and sets IsAlive to false. Mind you, forcing the collection to take place messes up the careful penny-pinching the garbage collector has been performing. You forced it to pay for a collection before it had finished saving up for it, putting the garbage collector in debt. (Is there a garbage collector debt collector?) And the effect of a garbage collector going into debt is that your program runs slower than it would have if you had let the collector spend its money on its own terms.

    Note also that forcing a generation-zero collection does not guarantee that the object in question will be collected: It may have been promoted into a higher generation. (Generational garbage collection takes advantage of typical real-world object lifetime profiles by spending only fifty cents on a partial collection rather than a whole dollar on a full collection. As a rough guide, the cost of a collection is proportional to the number of live objects scanned, so the most efficient collections are those which find mostly dead objects.) Forcing an early generation-zero collection messes up the careful balance between cheap-but-partial collections and expensive-and-thorough collections, causing objects to get promoted into higher generations before they really deserve it.

    Okay, that was a long discussion of a short email thread. Maybe tomorrow I'll do a better job of keeping things short.

    Bonus chatter: In addition to the Weak­Reference class, there is also the GC­Handle structure.

    Bonus reading: Maoni's WebLog goes into lots of detail on the internals of the CLR garbage collector. Doug Stewart created this handy index.

  • The Old New Thing

    Everybody thinks about CLR objects the wrong way (well not everybody)

    Many people responded to "Everybody thinks about garbage collection the wrong way" by proposing variations on auto-disposal based on scope.

    What these people fail to recognize is that they are dealing with object references, not objects. (I'm restricting the discussion to reference types, naturally.) In C++, you can put an object in a local variable. In the CLR, you can only put an object reference in a local variable.

    For those who think in terms of C++, imagine if it were impossible to declare instances of C++ classes as local variables on the stack. Instead, you had to declare a local variable that was a pointer to your C++ class, and put the object in the pointer.

    C++ (automatic storage duration):
    void Function(OtherClass o)
    {
     // No longer possible to declare objects
     // with automatic storage duration
     Color c(0,0,0);
     Brush b(c);
     o.SetBackground(b);
    }

    C#:
    void Function(OtherClass o)
    {
     Color c = new Color(0,0,0);
     Brush b = new Brush(c);
     o.SetBackground(b);
    }

    C++ (pointers):
    void Function(OtherClass* o)
    {
     Color* c = new Color(0,0,0);
     Brush* b = new Brush(c);
     o->SetBackground(b);
    }

    This world where you can only use pointers to refer to objects is the world of the CLR.

    In the CLR, objects never go out of scope because objects don't have scope.¹ Object references have scope. Objects are alive from the point of construction to the point that the last reference goes out of scope or is otherwise destroyed.

    If objects were auto-disposed when references went out of scope, you'd have all sorts of problems. I will use C++ notation instead of CLR notation to emphasize that we are working with references, not objects. (I can't use actual C++ references since you cannot change the referent of a C++ reference, something that is permitted by the CLR.)

    C#:
    void Function(OtherClass o)
    {
     Color c = new Color(0,0,0);
     Brush b = new Brush(c);
     Brush b2 = b;
     o.SetBackground(b2);
    }

    C++:
    void Function(OtherClass* o)
    {
     Color* c = new Color(0,0,0);
     Brush* b = new Brush(c);
     Brush* b2 = b;
     o->SetBackground(b2);
     // automatic disposal when variables go out of scope
     dispose b2;
     dispose b;
     dispose c;
     dispose o;
    }

    Oops, we just double-disposed the Brush object and probably prematurely disposed the OtherClass object. Fortunately, disposal is idempotent, so the double-disposal is harmless (assuming you actually meant disposal and not destruction). The introduction of b2 was artificial in this example, but you can imagine b2 being, say, the leftover value in a variable at the end of a loop, in which case we just accidentally disposed the last object in an array.
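    To make the double-disposal concrete, here is a minimal C++ sketch. It assumes a hypothetical Brush whose Dispose method is idempotent; the counters and the DisposeEveryLocal function exist only for illustration. Two pointers refer to one object, so "dispose every local" calls Dispose twice on the same Brush, and only the idempotence check keeps the second call harmless.

```cpp
#include <cassert>

// Hypothetical Brush with an idempotent Dispose.
class Brush {
public:
    void Dispose()
    {
        ++disposeCalls_;
        if (disposed_) return;   // second and later calls do nothing
        disposed_ = true;
        ++releases_;             // stands in for releasing a real resource
    }
    int DisposeCalls() const { return disposeCalls_; }
    int Releases() const { return releases_; }
private:
    bool disposed_ = false;
    int disposeCalls_ = 0;
    int releases_ = 0;
};

struct DisposeCounts { int calls; int releases; };

DisposeCounts DisposeEveryLocal()
{
    Brush brush;          // the one and only object
    Brush* b = &brush;
    Brush* b2 = b;        // a second reference to the same object

    // What "dispose every local at scope exit" would do:
    b2->Dispose();
    b->Dispose();

    return { brush.DisposeCalls(), brush.Releases() };
}
```

    DisposeEveryLocal() reports two calls but only one release. If Dispose were not idempotent (say, if it were really destruction), the second call would be a bug.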

    Let's say there's some attribute you can put on a local variable or parameter to say that you don't want it auto-disposed on scope exit.

    C#:
    void Function([NoAutoDispose] OtherClass o)
    {
     Color c = new Color(0,0,0);
     Brush b = new Brush(c);
     [NoAutoDispose] Brush b2 = b;
     o.SetBackground(b2);
    }

    C++:
    void Function([NoAutoDispose] OtherClass* o)
    {
     Color* c = new Color(0,0,0);
     Brush* b = new Brush(c);
     [NoAutoDispose] Brush* b2 = b;
     o->SetBackground(b2);
     // automatic disposal when variables go out of scope
     dispose b;
     dispose c;
    }

    Okay, that looks good. We disposed the Brush object exactly once and didn't prematurely dispose the OtherClass object that we received as a parameter. (Maybe we could make [NoAutoDispose] the default for parameters to save people a lot of typing.) We're good, right?

    Let's do some trivial code cleanup, like inlining the Color parameter.

    C#:
    void Function([NoAutoDispose] OtherClass o)
    {
     Brush b = new Brush(new Color(0,0,0));
     [NoAutoDispose] Brush b2 = b;
     o.SetBackground(b2);
    }

    C++:
    void Function([NoAutoDispose] OtherClass* o)
    {
     Brush* b = new Brush(new Color(0,0,0));
     [NoAutoDispose] Brush* b2 = b;
     o->SetBackground(b2);
     // automatic disposal when variables go out of scope
     dispose b;
    }

    Whoa, we just introduced a semantic change by what seemed like a harmless transformation: The Color object is no longer auto-disposed. This is even more insidious than the scope of a variable affecting its treatment by anonymous closures, for the introduction of temporary variables to break up a complex expression and the removal of one-time temporary variables are common transformations that people expect to be harmless, especially since many language transformations are expressed in terms of temporary variables. Now you have to remember to tag all of your temporary variables with [NoAutoDispose].

    Wait, we're not done yet. What does SetBackground do?

    C#:
    void OtherClass.SetBackground([NoAutoDispose] Brush b)
    {
     this.background = b;
    }

    C++:
    void OtherClass::SetBackground([NoAutoDispose] Brush* b)
    {
     this->background = b;
    }

    Oops, there is still a reference to that Brush in the o.background member. We disposed an object while there were still outstanding references to it. Now when the OtherClass object tries to use the reference, it will find itself operating on a disposed object.
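    The problem can be simulated in plain C++. This is a sketch under stated assumptions: the disposed flag, the storage parameter (which stands in for new Brush(...)), and the BackgroundIsDisposedAfterCall helper are all hypothetical, and the explicit Dispose call plays the role of the imagined automatic disposal of b at scope exit.

```cpp
#include <cassert>

// Hypothetical Brush that only remembers whether it has been disposed.
struct Brush {
    bool disposed = false;
    void Dispose() { disposed = true; }
};

struct OtherClass {
    Brush* background = nullptr;
    void SetBackground(Brush* b) { background = b; }  // keeps the reference
};

// Simulates Function with auto-disposal of the local variable b.
void Function(OtherClass* o, Brush* storage)
{
    Brush* b = storage;   // stands in for `new Brush(...)`
    o->SetBackground(b);
    b->Dispose();         // the imagined automatic disposal at scope exit
}

// Returns true if the caller's object is left holding a disposed Brush.
bool BackgroundIsDisposedAfterCall()
{
    Brush storage;
    OtherClass o;
    Function(&o, &storage);
    return o.background != nullptr && o.background->disposed;
}
```

    BackgroundIsDisposedAfterCall() returns true: the local variable went out of scope, but the object it referred to is still reachable through o.background, and it is now disposed.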

    Working backward, this means that we should have put a [NoAutoDispose] attribute on the b variable. At this point, it's six of one, a half dozen of the other. Either you put using around all the things that you want auto-disposed or you put [NoAutoDispose] on all the things that you don't.²

    The C++ solution to this problem is to use something like shared_ptr and reference-counted objects, with the assistance of weak_ptr to avoid reference cycles, and being very selective about which objects are allocated with automatic storage duration. Sure, you could try to bring this model of programming to the CLR, but now you're just trying to pick all the cheese off your cheeseburger and intentionally going against the automatic memory management design principles of the CLR.
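    For reference, the C++ pattern mentioned above looks roughly like this. It is a minimal sketch with hypothetical Parent and Child types and a live-object counter added purely for illustration: the parent owns the child through a shared_ptr, and the child points back through a weak_ptr so that the two do not keep each other alive.

```cpp
#include <cassert>
#include <memory>

struct Child;

// Hypothetical owner; counts itself in a caller-owned live-object counter.
struct Parent {
    explicit Parent(int* live) : live_(live) { ++*live_; }
    ~Parent() { --*live_; }
    std::shared_ptr<Child> child;   // owning reference
    int* live_;
};

struct Child {
    explicit Child(int* live) : live_(live) { ++*live_; }
    ~Child() { --*live_; }
    std::weak_ptr<Parent> parent;   // non-owning back-reference breaks the cycle
    int* live_;
};

// Returns how many objects are still alive after all locals are gone.
int LiveObjectsAfterScopeExit()
{
    int live = 0;
    {
        auto parent = std::make_shared<Parent>(&live);
        parent->child = std::make_shared<Child>(&live);
        parent->child->parent = parent;
        assert(live == 2);
        assert(!parent->child->parent.expired());
    }   // parent's count drops to zero, which in turn releases the child
    return live;
}
```

    LiveObjectsAfterScopeExit() returns 0. If Child::parent were a shared_ptr instead, the two objects would keep each other alive and the count would stay at 2, which is the cycle that weak_ptr is there to avoid.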

    I was sort of assuming that since you're here for CLR Week, you're one of those people who actively chose to use the CLR and want to use it in the manner in which it was intended, rather than somebody who wants it to work like C++. If you want C++, you know where to find it.

    Footnotes

    ¹ Or at least don't have scope in the sense we're discussing here.

    ² As for an attribute for specific classes to have auto-dispose behavior, that works only if all references to auto-dispose objects are in the context of a create/dispose pattern. References to auto-dispose objects outside of the create/dispose pattern would need to be tagged with the [NoAutoDispose] attribute.

    [AutoDispose] class Stream { ... };
    
    Stream MyClass.GetSaveStream()
    {
     [NoAutoDispose] Stream stm;
     if (saveToFile) {
      stm = ...;
     } else {
      stm = ...;
     }
     return stm;
    }
    
    void MyClass.Save()
    {
     // NB! do not combine into one line
     Stream stm = GetSaveStream();
     SaveToStream(stm);
    }
    