I've learned a lot while working on my first real WPF application: implementing multi-threaded file reading, using the Dispatcher object, using control templates to customize controls, the basics of application configuration, using abstract C# classes, and using anonymous methods.

I've posted the source code that goes along with this post.  It is in the file CLCV-Blog-3.ZIP at the end of this post.  This version of CLCV has some new features:

  • It has an options dialog box for setting the user's preferences.  This mechanism uses the application properties wizard in Visual Studio - this was surprisingly easy to do, but it does have one problem: I can't (yet) control where the data is saved.
  • File I/O is multi-threaded.  One thread reads the lines from the CSV input file and sends the resulting data to the UI thread using the UI's Dispatcher.
  • File reading can be quickly canceled - an important attribute for a program that is designed to load large files.
  • CLCV displays an accurate progress bar as it loads the file.  
  • The recently used file list now works correctly:
    • It discards files that no longer exist when CLCV starts.
    • It keeps the list sorted in order of use (most recent at the top).
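
The recently used file list behavior described above can be sketched in a few lines.  This is only an illustration, not the code from the ZIP file: the class name RecentFileList, its capacity parameter, and the case-insensitive path comparison are all assumptions of mine.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not taken from the CLCV source.
public sealed class RecentFileList
{
    private readonly List<string> _paths = new List<string>();
    private readonly int _capacity;

    public RecentFileList(int capacity)
    {
        if (capacity < 1) throw new ArgumentOutOfRangeException(nameof(capacity));
        _capacity = capacity;
    }

    // Most recently used path first.
    public IReadOnlyList<string> Paths => _paths;

    // Record that a file was just opened: move (or add) it to the top of the list.
    public void Touch(string path)
    {
        // Assumption: paths are compared case-insensitively, as on Windows.
        _paths.RemoveAll(p => string.Equals(p, path, StringComparison.OrdinalIgnoreCase));
        _paths.Insert(0, path);
        if (_paths.Count > _capacity)
            _paths.RemoveRange(_capacity, _paths.Count - _capacity);
    }

    // Call once at startup: drop entries whose files no longer exist.
    // The optional predicate replaces File.Exists, which makes testing easier.
    public void Prune(Func<string, bool> exists = null)
    {
        Func<string, bool> check = exists ?? File.Exists;
        _paths.RemoveAll(p => !check(p));
    }
}
```

Calling Touch() whenever a file is opened keeps the list in most-recent-first order, and calling Prune() once at startup removes stale entries.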

There are many new concepts for a native C++ developer to learn when ramping up on WPF and C#.  One of the most important is how to customize the look and feel of your applications - this is a raison d'être for WPF.  In my earlier versions of CLCV, I couldn't figure out how to customize the look and feel of buttons, so I rolled my own using a rectangle and some animations.  It looked OK, but this was completely the wrong way to go about it.

One of my friends (Joe Laughlin) pointed me in the right direction: customizing controls is easily done using Control Templates, which are designed to do exactly what I needed: completely controlling the look and feel of any WPF control while maintaining its semantics.  Even better, with WPF resources, this can easily be done for an entire application.  Control templates made it straightforward for me to set the look and feel of buttons in my application to a blue colored theme.  So far, I've just scratched the surface of control templates, but you can see what I did by looking in resources\button.xaml.  It is tied into the application in App.xaml like this:

    <Application.Resources>
        <ResourceDictionary>
            <ResourceDictionary.MergedDictionaries>
                <ResourceDictionary Source="Resources\Button.xaml" />
            </ResourceDictionary.MergedDictionaries>
        </ResourceDictionary>
    </Application.Resources>

The XAML above applies my customized Button control template to every button in the application without touching the XAML for the other windows or pages. This is very cool. More than cool, it's a great example of how WPF separates design (look and feel) from an application's semantics.

An important feature of CLCV V3 is its options dialog box.  This makes it easy to tweak and tune key parameters without recompiling.  Here is some info on the options:

  • Show Console: when checked, CLCV will display its debug/diagnostic console.
  • Synchronous Line Reads: this forces the file reader thread to post messages to the UI thread synchronously using Invoke() instead of BeginInvoke() (using Invoke() is slow).
  • Animate Directory Loading: when checked, this forces CLCV to visibly populate the tree view control as items are read (this is slow).
  • Maximum Outstanding Messages: this is the maximum number of outstanding messages the file reader thread will have with the UI thread.
  • Message priority: this is the DispatcherPriority used by the reader thread to send messages to the UI thread (in the calls to Invoke() or BeginInvoke()).
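
The Synchronous Line Reads and Message priority options map directly onto the two Dispatcher calls.  As a minimal sketch (UiPoster and Post are hypothetical names, not from the CLCV source), assuming a reference to the WindowsBase assembly:

```csharp
using System;
using System.Windows.Threading;   // Dispatcher lives in the WindowsBase assembly

// Hypothetical helper, not taken from the CLCV source.
public static class UiPoster
{
    // Hands one message to the UI thread's dispatcher at the given priority.
    public static void Post(Dispatcher ui, Action message,
                            bool synchronous, DispatcherPriority priority)
    {
        if (synchronous)
        {
            // Invoke() blocks the caller until the UI thread has run the delegate.
            ui.Invoke(priority, message);
        }
        else
        {
            // BeginInvoke() queues the delegate and returns immediately.
            ui.BeginInvoke(priority, message);
        }
    }
}
```

With Invoke() the reader thread waits for every single message to be handled, which is presumably why the synchronous option is the slow one.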

V3 of CLCV is still naively implemented in many regards (this is my first WPF app and I only started with C# and WPF in late December '06), but I knew from the get-go that I'd have to handle file reading in a thread separate from the UI thread to avoid UI hangs and sluggishness during file loading.

Of course, it is possible for single threaded applications to efficiently read files and keep their UI responsive - I have a native C++ class that provides I/O support for this.  My native class is extremely efficient and can easily drive the disk at its maximum sequential read rate with one thread.  But this is more complex in .NET 3.0.  While .NET does provide the fundamental support for asynchronous I/O, I'd essentially have to re-implement my native C++ classes in C#, and I'm not ready for that yet.

This approach would also be just as messy in WPF as it would be in native code: mixing I/O and processing UI events in one thread requires a state machine approach to handle issuing asynchronous reads, processing UI messages, and handling completed read events.  This is prone to complexity and can be difficult to debug and maintain.

Fortunately, it is very easy to create a file reading thread in .NET and for that thread to send its data to the UI thread asynchronously.  It takes surprisingly little code to do this:

    // Creates the loader and wires its events to the data view window.
    public FileLoader( DataViewWindowClass dvw )
    {
        MyDataViewWindow = dvw;

        // Option: post to the UI thread with Invoke() instead of BeginInvoke().
        UseSyncronousLineReads      = TheApp.UserProperties.SyncronousLineReadsFlag;

        // The five delegates the reader thread uses to talk to the UI thread.
        StartFileLoadHandler       += MyDataViewWindow.StartFileLoad;
        NewFileHandler             += MyDataViewWindow.AddNewFileHandler;
        NewDirHandler              += MyDataViewWindow.AddNewDirHandler;
        NewDirTreeHandler          += MyDataViewWindow.AddNewDirTreeHandler;
        FileCompletedSignalHandler += MyDataViewWindow.FileCompletedSignal;

        // Create (but do not start) the reader thread.
        FileLoaderThreadEntryPoint  = new ThreadStart( LoaderThread );
        MyThread                    = new Thread( FileLoaderThreadEntryPoint );

        // Sized by the Maximum Outstanding Messages option.
        WaterMarkSemaphore          = new Semaphore( TheApp.UserProperties.MaxOutstandingMessageCount,
                                                     TheApp.UserProperties.MaxOutstandingMessageCount );
    }

The code above is from DataView.Xaml.cs.  It creates the FileLoader object which owns the file reading thread.  The reading thread uses five delegates to communicate with the UI thread:

  1. One to signal that the file loading operation has actually started.  This message is used to send the file size to the UI thread so it can setup the progress bar.
  2. One message each for files, directories, and directory trees.  Each of these messages also includes the number of bytes read from the file so far.  This allows the UI thread to keep the progress bar up to date.
  3. And finally, one to signal that the file read is completed.
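
The constructor also creates WaterMarkSemaphore, sized by the Maximum Outstanding Messages option.  This post doesn't show how that semaphore is used, so the following is only one common way such a throttle can work, not necessarily what CLCV does.  ThrottledSender, Send, and the post callback are hypothetical names of mine.

```csharp
using System;
using System.Threading;

// Hypothetical sketch of a "maximum outstanding messages" throttle.
public sealed class ThrottledSender
{
    private readonly Semaphore _slots;
    private readonly Action<Action> _post;

    // maxOutstanding: how many messages may be queued but not yet handled.
    // post: hands a delegate to the consumer (for example, the UI thread).
    public ThrottledSender(int maxOutstanding, Action<Action> post)
    {
        _slots = new Semaphore(maxOutstanding, maxOutstanding);
        _post = post;
    }

    // Called on the producer (reader) thread.  Blocks while maxOutstanding
    // messages are still waiting to be handled by the consumer.
    public void Send(Action handler)
    {
        _slots.WaitOne();                  // take a slot, or wait for one
        _post(() =>
        {
            try { handler(); }
            finally { _slots.Release(); }  // consumer frees the slot when done
        });
    }
}
```

In a WPF application the post argument could be something like `a => dispatcher.BeginInvoke(DispatcherPriority.Background, a)`, so that each slot is released on the UI thread once the message has actually been processed.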

The reading thread does all the work to read the file, which is in comma separated value (CSV) format.  The CSV file contains all the data necessary for CLCV to reconstruct the directory tree scanned by CLC.  (Note: in the TestData directory from the ZIP file, I've included three CSV files, one small, one medium sized, and one large - this is actual data from some of my source code trees.)

The reader thread handles the following work:

  • file reads using a StreamReader object
  • detecting header lines and blank lines
  • splitting each line into the comma separated fields
  • parsing and converting the text data to binary data and putting that data in objects
  • sending those objects to the UI thread
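
The steps above can be sketched as a simple read loop.  This is a minimal illustration under stated assumptions, not the CLCV reader: the CLC column layout isn't described in this post, so CsvLineReader, the headerPrefix parameter, the naive comma split (no quoted fields), and the byte-count estimate are all mine.

```csharp
using System;
using System.IO;
using System.Text;
using System.Threading;

// Hypothetical sketch; the real CLC file layout is not described in this post.
public static class CsvLineReader
{
    // Reads CSV text line by line, skips blank lines and header lines, and
    // hands each record to onRecord with an estimate of the bytes read so far.
    // Returns the number of records delivered.
    public static int ReadRecords(TextReader reader, string headerPrefix,
                                  Action<string[], long> onRecord,
                                  CancellationToken cancel)
    {
        int records = 0;
        long bytesRead = 0;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            if (cancel.IsCancellationRequested) break;   // quick cancel

            // Estimate only: assumes UTF-8 text and a one-byte line ending.
            bytesRead += Encoding.UTF8.GetByteCount(line) + 1;

            if (line.Trim().Length == 0) continue;       // blank line
            if (!string.IsNullOrEmpty(headerPrefix) &&
                line.StartsWith(headerPrefix, StringComparison.OrdinalIgnoreCase))
                continue;                                // header line

            string[] fields = line.Split(',');           // naive: no quoted fields
            onRecord(fields, bytesRead);                 // e.g. parse and post to the UI
            records++;
        }
        return records;
    }
}
```

A StreamReader is a TextReader, so `new StreamReader(path)` can be passed in directly.  The onRecord callback is where text fields would be converted to binary values (for example with long.Parse) and sent to the UI thread along with the byte count for the progress bar.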

All in all, this works quite well for a first attempt: the UI stays alive (doesn't hang), the file read operation can be quickly canceled, and it's all done with straightforward code.

Note that getting the dispatch priority correct is very important.  On a single CPU system (like my laptop), using too low a priority simply causes the entire process to drag out.  Using too high a priority causes the I/O thread to starve the UI thread.  For example, if you set the dispatch priority to "input", then the UI thread may need to work through large numbers of reader messages before it processes real input events, such as a cancel request.  Setting the dispatch priority above "input" (to "loaded" or "render") will keep input events from being processed.

Going higher, to "render", will interfere with the actual rendering of the UI, causing the progress bar to be jerky.  Going higher than "render" to "databind", "normal", or "send" completely stops UI rendering and blocks all input, thus hanging the UI; this is the specific problem that multi-threaded I/O is intended to avoid.

The "Background" dispatch priority seems to work acceptably well.  Using this priority, the UI remains responsive while consuming input events from the reader thread relatively smoothly.

Note that using dispatcher priorities is orthogonal to setting thread priorities - the dispatcher priority is simply the priority at which the dispatcher removes input events (delegates) from its input queues.

But, there are some performance issues in this initial naive implementation:

  1. The progress bar doesn't terminate nicely - it gets updated to 100% complete in the UI file completion routine (see the function FileCompletedSignal()), but WPF spends a lot of time in this function before it re-renders the UI.  I need to figure this out.
  2. The biggest problem is the time it takes to actually build the set of tree view items that are used to populate the list of directories and files.  This takes 10 to 20 times longer than the actual file I/O, which is very surprising given that in native code, I have tools that do significantly more text processing while still remaining I/O bound - all in one thread.
  3. For small trees (just a few hundred items) this is fast enough not to be annoying (try loading the small and medium size test files).  However, for the 83,000+ elements in the largest file, processing all the data into TreeView items is excruciatingly slow.
  4. The performance of the tree view control itself also bogs down with the largest file.  It's noticeably sluggish when the selection changes.

That being said, my first implementation is certainly naive - I simply build a set of tree view items that mirror the entire file and directory structure.  It doesn't look like too much trouble to be a lot smarter about this - populating the tree view as needed from another data structure.  I'm going to try this next.
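
To make the "populate as needed" idea concrete, here is a rough sketch of what I have in mind; I haven't built this yet, so treat it as a guess.  DirNode and its loadChildren callback are hypothetical names.  The parsed data stays in a plain structure, and a node's child list is only built the first time it is asked for (for example, when the matching tree view item is expanded).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical lightweight model node.  A TreeViewItem would be created
// from it only when its parent is expanded in the UI.
public sealed class DirNode
{
    private readonly Func<DirNode, IEnumerable<DirNode>> _loadChildren;
    private List<DirNode> _children;   // stays null until first requested

    public DirNode(string name, Func<DirNode, IEnumerable<DirNode>> loadChildren)
    {
        Name = name;
        _loadChildren = loadChildren;
    }

    public string Name { get; }

    // True once the child list has been built.
    public bool IsLoaded => _children != null;

    // Builds the child list on first access and caches it afterwards.
    public IReadOnlyList<DirNode> Children
    {
        get
        {
            if (_children == null)
                _children = new List<DirNode>(_loadChildren(this));
            return _children;
        }
    }
}
```

With something like this, loading the 83,000+ element file would only create UI items for the directories the user actually opens, instead of for the whole tree up front.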

In upcoming posts, I'll explore better ways to handle the TreeView control (only populating it as necessary), do some profiling to see if I can speed up data processing, add some more advanced control templates, and explain why it's important to throttle the number of messages from the file reading thread (a producer) to the UI thread (the consumer).