When Are You Required To Set Objects To Nothing?


A quick follow up on my earlier entry on the semantics of Nothing in VBScript. I see code like this all the time:

Function FrobTheBlob()
  Dim Frobber, Blob
  Set Frobber = CreateObject("BitBucket.Frobnicator")
  Set Blob = CreateObject("BitBucket.Blobnicator")
  FrobTheBlob = Frobber.Frob(Blob)
  Set Frobber = Nothing
  Set Blob = Nothing
End Function

What's the deal with those last two assignments? Based on the number of times I've seen code that looks just like this, lots of people out there are labouring under the incorrect belief that you have to set objects to Nothing when you're done with them.

First off, let me be very clear: I'm going to criticize this programming practice, but that does NOT mean that you should change existing, working code that uses it! If it ain't broke, don't fix it.

The script engine will automatically clear those variables when they go out of scope, so clearing them the statement before they go out of scope seems to be pointless. It's not bad -- clearly the code works, which is the important thing -- but it needlessly clutters up the program with meaning-free statements. As I've ranted before, in a good program every statement has a meaning.
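To see the automatic cleanup happen, here is a minimal sketch, assuming it is run under Windows Script Host (for example with cscript.exe). The Noisy class and the UseNoisy procedure are hypothetical names made up for this illustration; the class exists only so that its Class_Terminate event makes the release visible.

```vbscript
Option Explicit

' Hypothetical class; it exists only to make teardown visible.
Class Noisy
  Public Name
  Private Sub Class_Terminate()
    WScript.Echo Name & " terminated"
  End Sub
End Class

Sub UseNoisy()
  Dim Thing
  Set Thing = New Noisy
  Thing.Name = "Thing"
  WScript.Echo "leaving UseNoisy"
  ' No "Set Thing = Nothing" here. The last reference goes away when
  ' the Sub returns, and Class_Terminate runs at that point.
End Sub

UseNoisy
WScript.Echo "back in caller"
```

The "Thing terminated" line shows up between "leaving UseNoisy" and "back in caller", with no explicit assignment to Nothing anywhere in the script.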

When I see code like this, the first thing I think is cargo cult programmer. Someone was told that the magic invocation that keeps the alligators away is to put a banana in your ear and then set objects to Nothing when you're done with them. They do, and hey, it works! No alligators!

Where the heck did this thing come from? I mean, you don't see people running around setting strings to "" or integers back to zero. You never see it in JScript. You only ever see this pattern with objects in VB and VBScript.

A few possible explanations immediately come to mind.

Explanation #1: (Bogus) Perhaps some earlier version of VB required this. People would get into the habit out of necessity, and when it became no longer necessary, it's hard to break the habit. Many developers learn by reading old code, so those people would pick up on the old practice.

This explanation is bogus. To my knowledge there has never been any version of VB that required the user to explicitly deallocate all objects right before the variables holding them went out of scope. I'm aware that there are plenty of people on the Internet who will tell you that the reason they set their objects to Nothing is because the VB6 garbage collector is broken. I do not believe them. If you've got a repro that shows that the GC is broken, I'd love to see it.

Explanation #2: (Bogus) Circular references are not cleaned up by the VB6 garbage collector. You've got to write code to clean them up, and typically that is done by setting properties to Nothing before the objects go out of scope.

Suppose you find yourself in this unfortunate situation:

Sub BlorbTheGlorb()
  Dim Blorb, Glorb
  Set Blorb = CreateObject("BitBucket.Blorb")
  Set Glorb = CreateObject("BitBucket.Glorb")
  Set Blorb.Glorber = Glorb
  Set Glorb.Blorber = Blorb
  '
  ' Do more stuff here
  '
End Sub

and now when the procedure finishes up, those object references are going to leak because they are circular.

But you can't break the cycle by clearing the variables; you have to clear the properties. You have to say

Set Blorb.Glorber = Nothing
Set Glorb.Blorber = Nothing

and not

Set Blorb = Nothing
Set Glorb = Nothing

Perhaps the myth started when someone misunderstood "you have to set the properties to Nothing" and took it to mean "variables" instead. Then, as in my first explanation, the misinformation spread through copying code without fully understanding it.
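The difference between clearing the variables and clearing the properties can be sketched with plain VBScript classes. This assumes Windows Script Host; the Node class, its Partner field, and both procedure names are hypothetical stand-ins for the BitBucket objects above, not a real API.

```vbscript
Option Explicit

' Hypothetical stand-in for the Blorb and Glorb objects.
Class Node
  Public Name
  Public Partner   ' holds a reference to the other object
  Private Sub Class_Terminate()
    WScript.Echo Name & " terminated"
  End Sub
End Class

Sub ClearVariablesOnly()
  Dim Blorb, Glorb
  Set Blorb = New Node
  Blorb.Name = "Blorb (variables cleared)"
  Set Glorb = New Node
  Glorb.Name = "Glorb (variables cleared)"
  Set Blorb.Partner = Glorb
  Set Glorb.Partner = Blorb
  ' Each object is still referenced by the other's Partner field,
  ' so clearing the variables releases neither of them.
  Set Blorb = Nothing
  Set Glorb = Nothing
  WScript.Echo "ClearVariablesOnly: leaving"
End Sub

Sub ClearProperties()
  Dim Blorb, Glorb
  Set Blorb = New Node
  Blorb.Name = "Blorb (properties cleared)"
  Set Glorb = New Node
  Glorb.Name = "Glorb (properties cleared)"
  Set Blorb.Partner = Glorb
  Set Glorb.Partner = Blorb
  ' Break the cycle. After this, only the local variables hold
  ' references, so both objects are released when the Sub returns.
  Set Blorb.Partner = Nothing
  Set Glorb.Partner = Nothing
  WScript.Echo "ClearProperties: leaving"
End Sub

ClearVariablesOnly
WScript.Echo "after ClearVariablesOnly"
ClearProperties
WScript.Echo "after ClearProperties"
```

The second pair reports its termination before "after ClearProperties" is printed. The first pair does not report termination when ClearVariablesOnly returns; whether and when the engine tears that cycle down later is outside what this sketch shows.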

I have a hard time believing this explanation either. Because they are error-prone, most people avoid circular references altogether. Could it really be that enough people ran into circular ref problems and they all solved the problem incorrectly to cause a critical mass? As Tommy says on Car Talk: Booooooooooooogus!

Explanation #3: It's a good idea to throw away expensive resources early. Perhaps people overgeneralized this rule? Consider this routine:

Sub FrobTheFile()
  Dim Frobber
  Set Frobber = CreateObject("BitBucket.Frobber")
  Frobber.File = "c:\blah.database" ' locks the file for exclusive r/w access
  '
  ' Do stuff here
  '
  Set Frobber = Nothing ' final release on Frobber unlocks the file
  '
  ' Do more stuff here
  '
End Sub

Here we've got a lock on a resource that someone else might want to acquire, so it's polite to throw it away as soon as you're done with it. In this case it makes sense to explicitly clear the variable in order to release the object early, as we're not going to get to the end of its scope for a while. This is a particularly good idea when you're talking about global variables, which are not cleaned up until the program ends.

Another -- perhaps better -- design would be to also have a "close" method on the object that throws away resources if you need to do so explicitly.  This also has the nice result that the close method can take down circular references.
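As a rough sketch of that design, assuming Windows Script Host: the LockingFrobber class below and its Acquire and CloseFrobber methods are hypothetical names invented for this example, and the "lock" is simulated with a flag rather than a real file.

```vbscript
Option Explicit

' Hypothetical stand-in for an object that holds a lock on a resource.
Class LockingFrobber
  Private Locked

  Private Sub Class_Initialize()
    Locked = False
  End Sub

  Public Sub Acquire(path)
    Locked = True
    WScript.Echo "locked " & path
  End Sub

  ' Explicit cleanup; safe to call more than once.
  Public Sub CloseFrobber()
    If Locked Then
      Locked = False
      WScript.Echo "unlocked"
    End If
  End Sub

  Private Sub Class_Terminate()
    CloseFrobber   ' safety net if the caller never closed explicitly
  End Sub
End Class

Sub FrobTheFile()
  Dim Frobber
  Set Frobber = New LockingFrobber
  Frobber.Acquire "c:\blah.database"
  ' Do stuff here
  Frobber.CloseFrobber   ' give the lock back as soon as we are done
  ' Do more stuff here; the variable can stay in scope
  WScript.Echo "still working, lock already released"
End Sub

FrobTheFile
```

With an explicit close method, the release of the resource no longer depends on which reference happens to be the last one, and the reader can see the intent at the call site.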

I can see how overapplication of this good design principle would lead to this programming practice. It's easier to remember “always set every object to Nothing when you are done with it” than “always set expensive objects to Nothing when you are done with them if you are done with them well before they go out of scope”. The first is a hard-and-fast rule; the second has two judgment calls in it.

I'm still not convinced that this is the whole story though.

Explanation #4: I originally thought when I started writing this entry that there was no difference between clearing variables yourself before they go out of scope, and letting the scope finalizer do it for you.  There is a difference though, that I hadn't considered. Consider our example before:

Sub BlorbTheGlorb()
  Dim Blorb, Glorb
  Set Blorb = CreateObject("BitBucket.Blorb")
  Set Glorb = CreateObject("BitBucket.Glorb")

When the sub ends, are these the same?

  Set Blorb = Nothing
  Set Glorb = Nothing
End Sub

versus

  Set Glorb = Nothing
  Set Blorb = Nothing
End Sub

The garbage collector is going to pick one of them, and which one, we don't know. If these two objects have some complex interaction, and furthermore, one of the objects has a bug whereby it must be shut down before the other, then the scope finalizer might pick the wrong one! 

(ASIDE: In C++, the order in which locals are destroyed is well defined, but it is still possible to make serious mistakes, particularly with the bane of my existence, smart pointers. See Raymond's blog for an example.)

The only way to work around the bug is to explicitly clean up the objects in the right order before they go out of scope.
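Here is a minimal sketch of such a workaround, assuming Windows Script Host. The Noisy class is hypothetical and only reports when it is torn down, and the "Glorb must go first" bug is invented for the illustration.

```vbscript
Option Explicit

' Hypothetical class; it exists only to make the teardown order visible.
Class Noisy
  Public Name
  Private Sub Class_Terminate()
    WScript.Echo Name & " terminated"
  End Sub
End Class

Sub BlorbTheGlorb()
  Dim Blorb, Glorb
  Set Blorb = New Noisy
  Blorb.Name = "Blorb"
  Set Glorb = New Noisy
  Glorb.Name = "Glorb"
  ' WORKAROUND for a hypothetical bug: Glorb must shut down before Blorb.
  ' Without these two lines the engine chooses the release order.
  Set Glorb = Nothing
  Set Blorb = Nothing
End Sub

BlorbTheGlorb
```

Run as written, "Glorb terminated" is printed before "Blorb terminated". The comment is the important part: it tells the next reader that these two assignments are there for a reason.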

And indeed, there were widely-used ADO objects that had this kind of bug.   Mystery solved.

I'm pretty much convinced that this is the origin of this programming practice.  Between ADO objects holding onto expensive recordsets (and therefore encouraging early clears) and shutdown sequence bugs, lots of ADO code with this pattern got written.  Once enough code with a particular pattern gets written, it passes into folklore that this is what you're always supposed to do, even in situations that have absolutely nothing to do with the original bug.

I see this all over the place. Here's some sample documentation that I copied off the internet:

You can save an instance of a persistent object using its sys_Save method. Note that you must call sys_Close on an object when you are through using it. This closes the object on the server. In addition you should set patient to Nothing to close the object in Visual Basic.

Dim status As String
patient.sys_Save
patient.sys_Close
Set patient = Nothing

Notice that calling the close method is a "must" but setting the variable to Nothing is a "should". Set your locals to Nothing: it's a moral imperative! If you don't, the terrorists have already won. (One also wonders what the string declaration there is for. It gets worse -- I've omitted the part of the documentation where they incorrectly state what the rules are for using parentheses. The page I got this from is a mass of mis-statements -- calling all of them out would take us very far off topic indeed.)

I would imagine that there are lots of these in the MSDN documentation as well.

What is truly strange to me though is how tenacious this coding practice is. OK, so some objects are buggy, and sometimes you can work around a bug by writing some code which would otherwise be unnecessary. Is the logical conclusion “always write the unnecessary code, just in case some bug happens in the future?”  Some people call this “defensive coding”.  I call it “massive overgeneralization”.

True story: I found a performance bug in the Whidbey CLR jitter the other day. There's a bizarre situation in which a particular mathematical calculation interacts with a bug in the jitter that causes the jitter to run really slowly on a particular method. It's screwing up our performance numbers quite badly.  If I change one of the constants in the calculation to a variable, the problem goes away, because we no longer hit the buggy code path in the jitter.

They'll fix the bug before we ship, but consider a hypothetical. Suppose we hadn't found the bug until after we'd shipped Whidbey. Suppose I needed to change my code so that it runs faster in the buggy Whidbey CLR. What's the right thing to do?

Solution One:  Change the constant to a variable in the affected method.  Put a comment as long as your arm in the code explaining to future maintenance programmers what the bug is, which versions of the framework cause the problem, how the workaround works, who implemented the workaround, and how to do regression testing should the underlying bug be fixed in future versions of the framework.  Realize that there might be similar problems elsewhere, and be on the lookout for performance anomalies.

Solution Two: Change all constants to variables. And from now on, program defensively; never use constants again -- because there might someday be a future version of the framework that has a similar bug. Certainly don't put any comments in the code. Make sure that no maintenance programmers can possibly tell the necessary, by-design uses of variables from the unnecessary, pointless uses. Don't look for more problems; assume that your heuristic solution of never using constants again is sufficient to prevent not only this bug, but future bugs that don't even exist yet. Tell other people that “constants are slower than variables”, without any context.  And if anyone questions why that is, tell them that you've been programming longer than they have, so you know best.  Maybe throw in a little “Microsoft suxors, open source rulez!” rhetoric while you're at it -- that stuff never gets old.

Perhaps I digress. I'd like to take this opportunity to recommend the first solution over the second.

This is analogous to what I was talking about the other day in my posting on Comment Rot. If you hide the important comments amongst hundreds of trivial comments, the program gets harder to understand.  Same thing here -- sometimes, it is necessary to write bizarre, seemingly pointless code in order to work around a bug in another object.  That's a clear case of the purpose of the code being impossible to deduce from the syntax, so call it out!  Don't hide it amongst a thousand instances of identical really, truly pointless code.

  • I would wager that this is due to one or both of the following: 1) an empirically derived rule, based on buggy automation implementations from "the good ol' days"; 2) well-meaning but wrong advice based on misunderstanding of scope, etc.

    Someone previously commented that Excel sometimes would get stuck in memory when invoked from JScript. It wasn't the only app. In the mid-to-late 1990s timeframe, lots of devs were learning to write automation objects, and making the requisite mistakes along the way. Perhaps stuck apps were due to actual bugs in their implementation. Or, perhaps the bugs were due to references getting parked in global variables or in some intermediate scope. Setting "o = Nothing" is black magic that works, so why not do it everywhere? Regarding my second point: I'm a C++ developer, so the idea of scope, ctors & dtors, etc. makes sense. But, for folks who started out with QBasic or Pascal and moved to VB, perhaps the idea of destruction was new. Granted, these languages did have scope, but especially with Basic, the paradigm (when I learned it, at least) was "all globals, baby!"
  • with apologies to Wm. Shakespeare...

    If setting objects to Nothing not be mandatory, it nonetheless remains useful from a _documentation_ standpoint, since once an object is set to Nothing then the script should no longer make use of that object.

    Those obsessed with the "cargo cult" metaphorical aspect of this practice may merely treat it as documentation, while those who continue to implement the practice for other reasons can do so freely.

    And then we can all live happily ever after!8-))
  • Indeed, self-documenting code is goodness!

    But why stop at objects? Why not set strings and arrays and numbers back to their initial values to indicate that you're done using them?

  • I think that I located the source of this practice among VBScripters ... in the Holy Scripting Bible (otherwise known as SCRIPT56.CHM), there is a very oft used Method called "Run", and therein is an example at the end of said holy verse ...

    =====

    Example 2
    The following VBScript code opens a command window, changes to the path to C:\ , and executes the DIR command.

    Dim oShell
    Set oShell = WScript.CreateObject ("WSCript.shell")
    oShell.run "cmd /K CD C:\ & Dir"
    Set oShell = Nothing

    =====

    Oops !!, it seems that this practice is sanctioned by the Scripting Bible itself !?!? It spread viral-like from the holy book itself ... ;o)

    (i've often thought that if the SCRIPT56.CHM was updated - surely a *long* overdue task, the inconsistencies are confusing and **many** - then a lot of things could be cleared up just with a documentation revision !?) ;o)
  • I know it is a bit late to comment on this blog, but I have just experienced one of those cases in VB6, where setting an object reference to Nothing made a difference.

    In this case, I have a UserControl which implements a custom interface using the Implements keyword. A form using this control stores an interface pointer (of this custom interface type) to the control.

    Initially, the VB6 application crashed on exiting. After setting the interface pointer to Nothing, the program no longer crashed on exiting.

    In fact, I have often seen this kind of problem if you store an interface to a control on a VB6 form.

    For example, if a form passes an interface to one of its controls to another form, then that other form had better free the interface. Otherwise the program will probably crash on exiting. (To be fair, this is a lousy practice anyway, but quite legal VB6).

    I wish these cases did not exist, but it does seem naive to believe that they don't.

    Phil
  • Another use of this construct while debugging (similar to specifying the order of destruction as referred to above) is that it defines where the Terminate event occurs, allowing a more defined sequence of execution.

    Of course, production destruction code shouldn't have these sort of dependencies...

  • Performance Question:

    We make a large collection of many objects. It seems to take longer (several minutes) to clean up the objects than to create them (under a minute). How can we clean them up faster? We just want them to all go away.

    We've tried similar tasks in C++ and Java and both are sub-second to create and delete millions of similar objects. VB takes about a minute to create and several minutes to delete.

    We've tried letting VB clean up automatically, and we've also tried to explicitly set the objects and the collection to Nothing. Nothing seems to get it going any faster.
  • Clearly something weird is going on. But there's nowhere near enough information for me to diagnose the problem.

    I would try running the slow code through a profiler and see what it comes up with. If that doesn't help, the best I can suggest is a support call -- make up a small, solid repro that demonstrates the problem, and call it in.
  • We've run side-by-side tests of simple samples: creating 1 million instances of simple objects (3 members: 1 Str, 1 Double, 1 Int) and adding them to basic collections. We've then tried deleting the collection and deleting the objects individually.

    We've done this in VB6, C++, and Java. Java and C++ complete the whole thing sub second. VB app takes almost a minute to create and several minutes to clean up. VB app freezes during the clean up but eventually completes.

    The item above by Peter Ibbotson at 4/28/2004 12:05 PM sounds exactly like what we are experiencing. We just accepted that VB was this slow but are hoping that there is some way to delete these objects more efficiently.

  • Same collection object in each case, or is it a different object in each?

    My guess -- which, I will point out again, is a guess in total absence of actually seeing the scenario -- is that some collection objects are optimized for large sets and some are not, and perhaps that is the nature of the problem?

  • We used the class wizard to create Class1 with 3 properties: A string, a Double, and a Long.

    Then used the class wizard to create a Collection of Class1 called Coll1.

    Then used the code below to create instances of Class1 and add them to an instance of Coll1.

    Up to 12,000 instances it is sub second and it is faster to delete than create (about .2 sec each).

    Above 12,000 instances it is slower to delete than create.

    At 100,000 instances it is 2 sec to create, 17 sec to delete.

    At 160,000 instances it is 3 sec to create, 45 sec to delete.

    Is there a better way?

    Option Explicit

    Private Sub cmdGo_Click()

    Dim n As Long
    Dim MyColl As Coll1

    Dim tStart As Double
    Dim tEnd As Double
    Dim tElapsedCreate As Double
    Dim tElapsedDelete As Double

    Set MyColl = New Coll1

    ' Time Creation of the objects
    tStart = Timer
    For n = 1 To CLng(Me.txtN)
        Call MyColl.Add(n, CStr(n), CDbl(n))
    Next n
    tEnd = Timer

    tElapsedCreate = tEnd - tStart
    'Me.txtCreateTime = CStr(tElapsedCreate)

    ' Time Deletion of the objects
    tStart = Timer

    Set MyColl = Nothing

    tEnd = Timer

    tElapsedDelete = tEnd - tStart
    'Me.txtDeleteTime = CStr(tElapsedDelete)

    MsgBox "Create: " & CStr(tElapsedCreate) & vbCrLf & "Delete: " & CStr(tElapsedDelete)

    End Sub

  • I believe you that it's slow. I don't know whether the collection class was even designed to handle collections of that size.

    My point though is that comparing this code against the "equivalent" code in C++ is not actually a valid comparison unless the C++ code uses exactly the same collection object. If the C++ code uses the same collection object AND it is still faster, then odds are very good that there's something wrong in the language engine. If it is just as slow, odds are very good that the problem is in the collection class.

    What collection objects do you use in the C++ and Java benchmarks?
  • In C++ and Java the tests were done with regular lists, not the same Collection class used in VB.

    But we've also tried a test in VB where we use a linked list rather than a collection. It is faster, but deleting the objects still takes much longer than it should.
  • Then you're not comparing apples to apples, are you? If you want to find out what's slow, you don't change _everything_. You change _one thing at a time_.

    Odds are very good that it's the collection object that's slow.

    In any event, is this a meaningful benchmark? Are you actually writing a program that must handle millions of data which are allocated and deallocated frequently? In that case I would recommend that you either (a) use a language which allows you fine-grained control over memory lifetime -- like, say, C++, or (b) use a language which runs in an environment with a specially tuned garbage collector, like C# or VB.NET.
  • I wish we had done it in C++.

    Our basic need is to create a list of structs to hold some data properties. We create them all at once, we use them, then we delete them all at once.

    Creating and using them is pretty fast. It's just deleting them that takes long. Which makes me wonder if we are doing something wrong. But it sounds like we did things the accepted way and this is just the built-in penalty for using VB.

    I was hoping there was some trick to speeding it up like there is for strings. We also need to concat large strings from many small parts. We started by using STR1 = STR1 & STR2, which was slow. We replaced it with a string buffer and it ran literally tens of thousands of times faster for very large strings.

    But it sounds like no such workaround is available for objects.
