StreamWriter Buffered Data Lost MDA (or a cute finalizer trick) [Brian Grunkemeyer]

A somewhat common problem when getting started with developing managed code and using our IO package is forgetting to close a Stream or a StreamWriter.  Users who write code like this will be disappointed:

    void Foo() {
        StreamWriter sw = new StreamWriter("file.txt");
        sw.WriteLine("Data");
        // Forgot to close the StreamWriter.
    }

In this example, the data in the StreamWriter is never written to the underlying stream. 

Background

StreamWriter internally buffers data in an attempt at reducing the aggregate amount of work needed to write out data, and FileStream does the exact same thing.  So for correctness, the StreamWriter (and FileStream) must be closed explicitly by the user.  While we can rely on finalization to ensure that the FileStream's handle is eventually closed, and we can probably ensure that any buffered data in a FileStream has been flushed to disk (even with SafeHandle in Whidbey, using the very weak ordering we added to critical finalization explicitly to solve this problem), we cannot ensure that the StreamWriter's buffer is written to disk.  The reason is that (normal) finalizers aren't ordered - any two objects may be finalized in any order, or at the same time if we add multiple finalizer threads in a future version. 
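To see the buffering in isolation, here is a minimal sketch. It swaps the file for a MemoryStream (my substitution, so the underlying stream's length can be inspected directly), and the class and method names are made up for illustration.

```csharp
using System;
using System.IO;

static class BufferingDemo
{
    // Returns the underlying stream's length before and after Flush.
    public static long[] LengthBeforeAndAfterFlush()
    {
        MemoryStream ms = new MemoryStream();
        StreamWriter sw = new StreamWriter(ms);

        sw.Write("Data");          // held in the StreamWriter's internal buffer
        long before = ms.Length;   // nothing has reached the stream yet

        sw.Flush();                // encodes the buffered chars and writes them out
        long after = ms.Length;    // the encoded bytes are now in the stream

        return new long[] { before, after };
    }

    public static void Main()
    {
        long[] lengths = LengthBeforeAndAfterFlush();
        Console.WriteLine("Before Flush: {0} bytes, after Flush: {1} bytes",
                          lengths[0], lengths[1]);
    }
}
```

If the StreamWriter were simply abandoned after the Write call, the bytes counted in the second measurement are the ones that would be lost.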

The explanation of this problem has made it into a few different forums, including Jeffrey Richter's Applied Microsoft .NET Framework Programming, on pages 484-485.  (Jeff chose to use BinaryWriter here instead of StreamWriter, but he's discussing the same issue.  However, it's not relevant with BinaryWriter because it doesn't have an internal buffer in our current implementation.  I'll ask Jeff to fix that for his next edition.)

Detecting Data Loss & Notifying the Developer

In any event, users who make the above mistake don't get any data written to their file, and they don't get any indication that they lost data by not closing the StreamWriter.  I'm investigating a change to fix that for Whidbey Beta 2.  We can detect this by adding a finalizer to StreamWriter whose sole purpose is to check for buffered data, and if found, then report an error.  We've added something to the product called Managed Debugging Assistants (MDA's) in this version, and while they're not the easiest thing to turn on right now, they should be well-integrated with Visual Studio sometime before we release the product.

When enabled, this MDA will display some message roughly like this:

    A StreamWriter wasn't closed and all buffered data within that StreamWriter wasn't flushed to the underlying stream. (This was detected when the StreamWriter was finalized with data in its buffer.) A portion of your data is lost. Consider one of calling Close(), Flush(), setting the StreamWriter's AutoFlush property to true, or allocating the StreamWriter with a "using" statement to ensure your StreamWriter is properly cleaned up.
    Stream type: System.IO.FileStream
    File name: C:\Test\IO\StreamWriterBufferLostMDA\junk.tmp
    Allocated from:
       at System.IO.StreamWriter.Init(Stream stream, Encoding encoding, Int32 bufferSize)
       at System.IO.StreamWriter..ctor(String path, Boolean append, Encoding encoding, Int32 bufferSize)
       at System.IO.StreamWriter..ctor(String path)
       at LosesData.Main()

Note the addition of a stack trace here, showing you where the StreamWriter was allocated.  In a large application that uses several StreamWriters, knowing where the leaked one was allocated is very useful, so you can easily find which code needs to be fixed.  In the example above, it was allocated from LosesData::Main(), which was my simple test case to demonstrate this problem.

MDA's are interesting because they can be disabled & enabled via settings in a debugger, or in a config file.  The exact details of how to enable this (via an entry in a file called foo.mda.config?) or when this will be enabled (i.e., only when you have a managed debugger attached, or if any debugger is attached?) are still being decided, so this may not show up exactly like this in Beta 2 or our final Whidbey bits.  But hopefully this gives you an idea of some ways we're trying to make people more productive by helping them find their problems more quickly, while not seriously penalizing working code.

How to Clean Up a StreamWriter

There are a few ways of fixing this problem in your code, whether you've relied on the MDA to track it down, or you've noticed that your file is missing up to 4K worth of data.

Use the using statement in C# & VB.  In managed C++, use a try/finally to call Dispose. 

    void Foo() {
        using(StreamWriter sw = new StreamWriter("file.txt")) {
            sw.WriteLine("Data");
        }
    }

Or you can use the long form, expanding out the using clause:
                
    void Foo() {
        // Initialize to null so the finally block can test it even if
        // the constructor throws.
        StreamWriter sw = null;
        try {
            sw = new StreamWriter("file.txt");
            sw.WriteLine("Data");
        }
        finally {
            if (sw != null)
                sw.Close();
        }
    }

If neither of these solutions can be used (say, if you have a StreamWriter stored in a static variable and thus you cannot easily run code at the end of its lifetime), then calling Flush on the StreamWriter after its last use or setting its AutoFlush property to true before its first use will be sufficient.  Here's an example:

    internal static class Foo {
        private static StreamWriter _log;

        static Foo() {  // Static class constructor
           StreamWriter sw = new StreamWriter("log.txt");
           sw.AutoFlush = true;
           // Now publish the StreamWriter for other threads.
           _log = sw;
        }
    }
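The other option mentioned above, calling Flush after each use, might look like the following sketch.  The Log class, its WriteLine method, and the lock are my own illustrative choices rather than anything from the Framework. The lock is there because StreamWriter instances aren't thread-safe.

```csharp
using System.IO;

internal static class Log
{
    private static readonly object _lock = new object();
    private static readonly StreamWriter _writer = new StreamWriter("log.txt");

    internal static void WriteLine(string message)
    {
        lock (_lock)   // serialize access; StreamWriter is not thread-safe
        {
            _writer.WriteLine(message);
            _writer.Flush();   // leave nothing in the buffer between calls
        }
    }
}
```

A caller would just write Log.WriteLine("started").  Flushing on every call costs some throughput compared with letting the buffer fill, which is the same trade-off AutoFlush makes.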

Other finalization tricks

You can play other tricks with finalizers as well.  I briefly added code to Object's finalizer that flagged any objects using IDisposable that didn't get disposed (the StreamWriter case here is really a subset of a much broader problem - improper resource cleanup).  However, we didn't like adding the finalizer to Object because its existence could hurt performance in retail builds, and it might hinder a few future optimizations we wanted to make.  The error detection was also somewhat noisy - it found a lot of issues, but not all of them were really bugs that need to be fixed.  But perhaps there's something here of merit that's worth revisiting... 

In any case, you could do this in debug builds of your own code by adding your own base class or set of base classes for your own types.  While you could take the route of defining a MyProjectObject, that's probably not a good idea.  Instead, look at any base classes you might own - they're probably natural places for this type of error tracking in debug builds.  If I couldn't change the Object class, Stream might be a good runner-up, for example.  And the best part is you can make these changes to your code in Everett - you don't have to wait for us to design an MDA reporting infrastructure, then invent some useful individual MDA's to find interesting features like this.
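Here is a rough sketch of what such a base class could look like.  TrackedDisposable and everything in it are hypothetical names of my own, and this is only one way to arrange it: the finalizer and the captured stack trace exist only in debug builds, and Dispose suppresses finalization, so the finalizer runs only for objects that were never disposed.

```csharp
using System;
using System.Diagnostics;

// Hypothetical base class for your own disposable types.
public abstract class TrackedDisposable : IDisposable
{
#if DEBUG
    // Captured at construction so the report can say where the leak came from.
    // Capturing a stack trace is expensive, hence debug builds only.
    private readonly string _allocationStack = Environment.StackTrace;
#endif
    private bool _disposed;

    public bool IsDisposed
    {
        get { return _disposed; }
    }

    public void Dispose()
    {
        if (_disposed)
            return;
        _disposed = true;
        Dispose(true);
        GC.SuppressFinalize(this);   // a disposed object never reaches the finalizer
    }

    // Derived types release their resources here.
    protected virtual void Dispose(bool disposing)
    {
    }

#if DEBUG
    ~TrackedDisposable()
    {
        // Reaching this point means Dispose was never called.
        Debug.WriteLine(GetType().FullName +
            " was finalized without being disposed.  Allocated from:" +
            Environment.NewLine + _allocationStack);
    }
#endif
}
```

In retail builds the type has no finalizer at all, so it avoids the performance concern described above.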

One annoyance with the finalizer approach that I ran into was that if the app simply quit, the finalizer either wasn't running, or it took so long to run that the CLR gave up & exited (we were spinning up a lot of code during the finalizer in this trivial test case, and we want to shut down within ~2 seconds of returning from main), or MDA's are disabled during process shutdown.  I don't know which of these cases I was running into.  But it was easy to fix by adding a call to GC.Collect() then GC.WaitForPendingFinalizers() to the end of Main.
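A minimal sketch of that workaround follows, assuming a test case shaped like my LosesData example.  The WriteWithoutClosing helper is a name I've made up here. Putting the leak in its own method is a precaution so that the StreamWriter is no longer referenced from Main's frame when the collection runs.

```csharp
using System;
using System.IO;

class LosesData
{
    static void WriteWithoutClosing()
    {
        StreamWriter sw = new StreamWriter("junk.tmp");
        sw.WriteLine("Data");
        // Deliberately not closed: this is the bug being demonstrated.
    }

    public static void Main()
    {
        WriteWithoutClosing();

        GC.Collect();                    // find the unreachable StreamWriter now
        GC.WaitForPendingFinalizers();   // wait for the finalizer thread to finish
    }
}
```

Whether a report actually appears still depends on the MDA being enabled.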

Comments

  • I was talking to someone not too long ago who was experiencing a similar problem with DataReaders.

    People weren't closing them, eventually causing him to run out of connections. Seems like this would fix that as well.

    Rad.
  • You should also mention another nice effect that appears if you construct a StreamWriter on top of a Stream.
    I.e.
    Stream stream = new FileStream("foo.txt", FileMode.Create);
    using (StreamWriter writer = new StreamWriter(stream))
    {writer.WriteLine("Line1");}

    using (StreamWriter writer = new StreamWriter(stream))
    {writer.WriteLine("Line2");}

    So the first writer closes the stream immediately after writing "Line1", and then the second writer just throws an "already closed" exception.
    Looking through MSDN, there is even a sample with exactly this code that definitely can't run :)
  • The manual expansion of using had sw.Close() in the finally, but I'd imagine that it's sw.Dispose() instead (which of course will eventually do the same thing).

    Not that it much matters, just pointing out in case anyone's confused by it. :)
  • James, yes, you're right in that Dispose and Close are equivalent. The problem is that you can't call Dispose on the StreamWriter without casting it to IDisposable. When we added in the IDisposable interface in V1, we felt Close made more sense on all IO-related classes, so in some cases (like TextWriter), we privately implemented the IDisposable interface. In retrospect this may have been a mistake - either both should have been public, or we should have simply renamed Close to Dispose.

    Alex, if you happen to see that buggy code in MSDN, feel free to click on the "Send feedback" link at the bottom of the page to let us know about it. If the docs & samples can be improved and you know exactly how or at least why, sending in feedback is a great idea. That feedback does go to the correct person in our user education group, and you'll usually see the improvement in the next one or two quarterly MSDN releases.
  • Can third party components implement managed debugging assistants?
  • Unfortunately, 3rd parties won't be able to add MDA's in Whidbey. Some people are looking at doing that work in our Orcas release, after Whidbey.