
Extending Data Contracts

So let's say you're working on a WCF app, following a contract-first development model, and autogenerating your data contracts from XSD schema definitions. Svcutil does a reasonable job of creating your data contract classes. Curiously, collection type classes are autogenerated without the partial keyword in the class declaration (more on this later). Non-collection types are created with a partial class declaration. For example, here's the declaration of a data contract generated using the .NET 3.0 framework:

[System.CodeDom.Compiler.GeneratedCodeAttribute("System.Runtime.Serialization", "3.0.0.0")]
[System.Runtime.Serialization.DataContractAttribute(Namespace="http://www.mycompany.com/20080910/BusinessItemDataContract")]
[System.SerializableAttribute()]
public partial class BusinessItemDataContract : object, System.Runtime.Serialization.IExtensibleDataObject

Because the class is partial, we can extend it in a separate file. When the BusinessItemDataContract is regenerated, the partial class code in the separate file remains untouched. This can be used on the server side to adjust values before sending data over the wire. Additional values can also be added to the data contract, and they will be serialized if they carry a DataMember attribute. This could be useful when the contract (i.e. the XSD) is provided by another WCF service and you want to embellish it with additional business data before returning it to the client. For example:

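public partial class BusinessItemDataContract
{
  // Backing field for the additional value; it lives only in this hand-written file.
  private string _clientNumber;

  [DataMember]
  public string ClientNumber
  {
    get { return _clientNumber; }
    set { _clientNumber = value; }
  }
}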

A data member is not required by default (the IsRequired property of the DataMember attribute defaults to false). Keep it that way. If the new member is marked as required, a call to the service returning the original BusinessItemDataContract will fail because the returned data will not contain the new ClientNumber member. The returned data adheres to the original BusinessItemDataContract XSD, which does not contain this new element.
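The effect of IsRequired can be seen in a small round trip. The following is only a sketch under stated assumptions: the OriginalItem, OptionalItem and RequiredItem classes, the contract name "BusinessItem" and the http://example.com/sketch namespace are hypothetical stand-ins for the generated contract and are not taken from the schema above.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Text;
using System.Xml;

// Hypothetical stand-in for the contract as the upstream service defines it.
[DataContract(Name = "BusinessItem", Namespace = "http://example.com/sketch")]
public class OriginalItem
{
    [DataMember] public string Name { get; set; }
}

// Same contract name, extended with an optional member (IsRequired defaults to false).
[DataContract(Name = "BusinessItem", Namespace = "http://example.com/sketch")]
public class OptionalItem
{
    [DataMember] public string Name { get; set; }
    [DataMember] public string ClientNumber { get; set; }
}

// Same contract name, but the extra member is marked as required.
[DataContract(Name = "BusinessItem", Namespace = "http://example.com/sketch")]
public class RequiredItem
{
    [DataMember] public string Name { get; set; }
    [DataMember(IsRequired = true)] public string ClientNumber { get; set; }
}

public static class IsRequiredSketch
{
    // Serializes a value to an XML string with the DataContractSerializer.
    public static string Serialize<T>(T value)
    {
        DataContractSerializer ser = new DataContractSerializer(typeof(T));
        using (MemoryStream stream = new MemoryStream())
        {
            ser.WriteObject(stream, value);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }

    // Reads an XML string back as type T.
    public static T Deserialize<T>(string xml)
    {
        DataContractSerializer ser = new DataContractSerializer(typeof(T));
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            return (T)ser.ReadObject(reader);
        }
    }

    // Returns true when reading the XML as RequiredItem throws a SerializationException.
    public static bool FailsWhenRequired(string xml)
    {
        try
        {
            Deserialize<RequiredItem>(xml);
            return false;
        }
        catch (SerializationException)
        {
            return true;
        }
    }
}
```

Serializing an OriginalItem and reading the result back as an OptionalItem should succeed, with ClientNumber left unset. Reading the same XML as a RequiredItem should fail, which mirrors the failure described above.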

Collection type data contracts are not so easily extended - and for good reason. An autogenerated collection type DataContract looks like:

[System.CodeDom.Compiler.GeneratedCodeAttribute("System.Runtime.Serialization", "3.0.0.0")]
[System.Runtime.Serialization.CollectionDataContractAttribute(Namespace="http://www.mycompany.com/20080910/BusinessItemCollection", ItemName="BusinessItemDataContract")]
[System.SerializableAttribute()]
public class BusinessItemCollection : System.Collections.Generic.List<BusinessItemDataContract>

Autogenerated code for a collection type contract uses a CollectionDataContract attribute to drive the serialization of the contained items, in this case the BusinessItemDataContract, and not the collection itself. So, even if we modified the autogenerated class to make it a partial class, any DataMember properties would not be serialized. Further, any manual updates to an autogenerated class would need to be reapplied the next time the class is regenerated.

Collection type data contracts are well documented here if you'd care to look further:
http://msdn.microsoft.com/en-us/library/aa347850(VS.85).aspx

How To Debug Windows Service OnStart

The article How to: Debug Windows Service Applications details how to attach to a running process to debug a service. This works when you're debugging a logic issue rather than hunting for the root cause of an early failure when the service starts up. Debugging the OnStart method can be problematic. According to the article:

One way to work around this is to create a temporary second service in your service application that exists only to aid in debugging. You can install both services, and then start this "dummy" service to load the service process. Once the temporary service has started the process, you can then use the Debug menu in Visual Studio to attach to the service process.

After attaching to the process, you can set breakpoints and use these to debug your code. Once you exit the dialog box you use to attach to the process, you are effectively in debug mode. You can use the Services Control Manager to start, stop, pause and continue your service, thus hitting the breakpoints you've set. You would later remove this dummy service after debugging is successful.

Rather than creating a new service solely for debugging purposes, you can call the System.Diagnostics.Debugger.Launch method from OnStart. To use this approach the service must be compiled for Debug, and the PDB file must be available, preferably in the same directory as the installed service. Don't forget that you don't need an MSI to install the service. Running installutil.exe against a DLL containing an Installer for the service is sufficient.

After you install and start the service, on Vista, you are presented with the following dialog:

[Image: Debug JIT dialog]

After selecting "Yes, debug MyNewService.exe" you will be presented with a dialog asking whether you want to grant or deny access. Accessing services requires administrator-level access. Then, you're given the option to load the IDE:

[Image: Launch IDE dialog]

Finally, the Visual Studio .NET IDE will load with the breakpoint set at the line containing the Debugger.Launch call.

[Image: Debugger stopped in Visual Studio]

Just make sure to remove this line of code before you check the service back into a source control system. This is something that should never go into a production environment. 
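If removing the call by hand feels error-prone, one option is to guard it so that it compiles away outside of Debug builds and only fires when asked. The following is a minimal sketch, not a definitive implementation. The StartupDebugging class and the /debug start parameter are hypothetical names chosen for illustration; the sketch assumes the service is started with that parameter (for example from the Start parameters box in the Services console).

```csharp
using System;
using System.Diagnostics;

public static class StartupDebugging
{
    // Hypothetical start parameter; any agreed-upon value would do.
    public const string DebugSwitch = "/debug";

    // Decides whether a debugger should be requested for this start.
    public static bool ShouldLaunch(bool debuggerAttached, string[] startArgs)
    {
        if (debuggerAttached || startArgs == null)
        {
            return false;
        }
        return Array.Exists(
            startArgs,
            arg => string.Equals(arg, DebugSwitch, StringComparison.OrdinalIgnoreCase));
    }

    // Calls to this method are omitted by the compiler unless DEBUG is defined.
    [Conditional("DEBUG")]
    public static void LaunchIfRequested(string[] startArgs)
    {
        if (ShouldLaunch(Debugger.IsAttached, startArgs))
        {
            Debugger.Launch();
        }
    }
}

// Intended use inside the service class (shown as a comment because it needs
// a reference to System.ServiceProcess):
//
// protected override void OnStart(string[] args)
// {
//     StartupDebugging.LaunchIfRequested(args);
//     // ... normal startup work ...
// }
```

With this arrangement a Release build never prompts for a debugger, and a Debug build prompts only when the start parameter is supplied. It is still worth reviewing the call before the code is checked in.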

While writing this entry I also came across a useful post documenting five different ways to debug a Windows service:

Five Ways to Debug a Windows Service

Using a FM Transmitter with a Zune

Like many geeks, I'd like the ability to play music anywhere in my house from the music collection that I have aggregated on a PC sitting in my home office. The only thing special about this PC is that it has the Zune client installed. It's not a Windows Media Center PC. Nonetheless, I wanted a solution that allows me to play music hosted on a PC on the second floor of my home throughout the rest of the house. A friend of mine clued me in to a device from C. Crane. Their Digital FM Transmitter accepts input through a standard headphone jack and broadcasts over adjustable FM frequencies. It retails for $69.95 at the time of this writing when purchased directly from C. Crane.

There are other FM transmitters available, like the Belkin TuneCast II, which retails for $39.99, but its broadcast range is 10-30 feet. Belkin recommends keeping it within 10 feet of the receiver to minimize interference. I'd like a longer range than that for broadcasting throughout my house.

Tuning 
Finding the right frequency can be a challenge. Ideally, the frequency is not only unused but also 1 MHz or so away from any active channel in either direction. Conveniently, the Radio Locator site can be used to find both used and unused frequencies by zip code. Once you've found a viable frequency you'll need to tune your receiver. After trying this with both an analog and a digital receiver, I much prefer the digital receiver. The additional precision becomes a necessity when you have a narrow range of unused frequencies between two active frequencies.

There's an interesting post on Amazon on how to increase the range by decreasing the setting of an internal resistor. In order to access this resistor you need to unscrew the underside of the device. Modification voids the warranty. Use at your own risk:
How to Fix the Range Problem 

Positioning
Now that I have a good set of frequencies to try, it's time to find the best location for the transmitter. I attached the device to the headphone jack of my laptop, positioned myself close to the antenna attached to my home theater receiver and validated that I could, indeed, hear the dulcet tones of Ministry. I figured I'd start with something loud. It worked. Radio Free John was on the air.

I then started to wander about until the signal was lost, found, and lost again. Unsurprisingly, it works best when the transmitter antenna and the receiver antenna are within line of sight. Ideally, I wanted it to work on the second floor and so up I went and lo, there was still music. It didn't come in as clearly as on the first floor. I did have to adjust the antenna of the receiver a bit. So, the broadcast works fine for the home theater system. Most of the battle is won. I also have a small boom box that tends to wander between the basement and the kitchen. I was able to pick up the signal in the kitchen, but was out of luck in the basement.

Going Quasi-Mobile
So I'm left with coverage over 2/3rds of the house. Not a bad deal for a $70 purchase. There are times I do want to be in the basement. There's a dart board down there and the games are much better with a beer and some music. This solution addresses the latter.

The device comes with a DC adapter and also has the option of using two AA batteries. So, for late night dart games, the Zune takes the place of my home office PC and my tunes are available down in the bowels of the house. With two fresh batteries the transmitter is unfettered and I'm off to the basement for a round of cricket.

[Image: Zune and FM Transmitter]

At this point I am quasi-mobile. I can roam the house, but globetrotting is not much of an option. I took it along for an extended car trip hoping to broadcast to my car radio from the passenger seat using this arrangement. I did not meet with success. Even if I had managed to get a signal as I left my house, I would have had to retune when crossing into areas with different radio stations.

Since I did have it on the road, I decided to give it a shot with the digital clock radio in the hotel room. So, I checked for the ideal unused frequency at Radio Locator, tuned the transmitter to an unused frequency and adjusted the clock radio using the analog dial. After some adjustments I was able to pick up the transmission. However, the range was only a foot or two. The clock radio has no external antenna.

[Image: Zune Table]

Summary
My experiences with the device can be summed up with these observations:

- Place the broadcast source at a high, central location in the house to maximize the range.
- Antenna size and positioning matter.
- Don't expect it to work in your car.

It's not a Media Center PC, but it is an inexpensive solution for broadcasting music from a central PC or from your favorite MP3 player.

Posted by jiwasz@microsoft.com

Planning for Windows Home Server with the Drobo

For several months now I've been considering putting together a home network that meets the following requirements:

  • Centralized backup - computers, media files, and other documents are backed up to and available to be restored from a single location
  • Access to media from any machine - ability to play music and video from any computer and TV
  • Easy interface - interface for playback should be usable by the non-computer savvy
  • Remote access - capability to send and retrieve files on home network while on the road

I recently read a blog post by a colleague discussing an environment that meets these requirements:

Architecture of the Charran eHome

The central component is Windows Home Server (WHS). I was intrigued by the Drobo when I read about it in the This Week in Photography (TWIP) blog entry: Mini Review of the Drobo. This appeared to be an either-or decision, as WHS doesn't support any drives with RAID functionality. The Drobo uses its own striping approach, which allows for roughly 3/4 disk utilization rather than the 1/2 you get with a mirrored RAID configuration. Nonetheless, it's close enough to a RAID solution to warrant concern. That is, until I read this blog post from someone successfully using a Drobo with WHS:

Drobo + Windows Home Server = Goodness

This post recommends that if you use the Drobo, you make it the only storage you use, so your WHS machine needs only a small primary hard drive. Its primary job is to be a file server.

The minimum requirements for WHS are:

  • Computer with 1 GHz Pentium III (or equivalent) or faster processor
  • 512 MB of RAM or more
  • 70 GB or larger ATA, SATA, or SCSI hard drive as the primary hard drive and any number of additional hard drives of any size

An old machine collecting dust in the closet could suit the purpose, keeping the cost down to that of the Drobo. Or, if a green PC is more to your liking, the Everex gPC2 meets the requirements for $200. It comes installed with Linux, but can be repurposed.

Unfortunately, my work leaves me with little time to tinker at home. Updates to this effort will come over the next several months.

Additional Resources

Windows Home Server Team Blog 
Microsoft Windows Home Server Unleashed
MS Windows Home Server - Blog is written by a Microsoft MVP, but is not an official Microsoft resource

This Week in Photography - Mentioned earlier in the blog. This is a good resource for digital photographers which first clued me in to the Drobo.

Recover Documents from MOSS 2007 Database

Introduction

A couple of years ago I wrote a post on how a simple VBS script can be used to extract a document from a SharePoint 2003/WSS 2.0 database (Recover Documents from SharePoint 2003 Database). Given the traffic that post still receives and the adoption rate of MOSS 2007, an update looks necessary.

This approach obviates the need to restore a content database into a MOSS 2007 environment if the intention is to extract a few critical documents.

Disclaimer

A few disclaimers are necessary. This script goes directly against a MOSS 2007 content database which is generally discouraged. Any code using the database directly will not be supported by MS Product Support Services. Instead, the SharePoint and WSS APIs are the way to go. If the database is modified directly rather than through the published SharePoint and WSS APIs Product Support Services cannot properly troubleshoot any unexpected issues. This script reads from the database so we're safe from errant modifications. However, the data structure the script queries could change in a future service pack.

Script

With that out of the way, let's proceed.  

The script queries the dbo.AllDocs table which contains the application documents and retrieves the most current version based on the document name which is then streamed out as binary data to a file:

Dim server
Dim contentDatabase
Dim leaf
Dim outputPath

server = "[SERVERNAME]"
contentDatabase = "[CONTENTDATABASE]"
leaf = "[LEAFNODE]"
outputPath = "[OUTPUTPATH]"

ExtractDoc server, contentDatabase, leaf, outputPath

' Streams the current version of the named document out of the content database.
Sub ExtractDoc(server, contentDatabase, leaf, outputPath)

  Dim conStr, selectStr, cn, rs, mstream

  conStr = "Provider=SQLOLEDB;Data Source=" & server & ";Initial Catalog=" & contentDatabase & ";Trusted_Connection=yes"

  ' Single quotes in the file name are doubled so they do not break the query.
  selectStr = "SELECT dbo.AllDocStreams.Content FROM dbo.AllDocs "
  selectStr = selectStr & "INNER JOIN dbo.AllDocStreams "
  selectStr = selectStr & "  ON dbo.AllDocs.ID = dbo.AllDocStreams.ID "
  selectStr = selectStr & " AND dbo.AllDocs.Level = dbo.AllDocStreams.Level "
  selectStr = selectStr & " WHERE LeafName='" & Replace(leaf, "'", "''") & "' AND IsCurrentVersion=1"

  Set cn = CreateObject("ADODB.Connection")
  cn.Open conStr
  Set rs = cn.Execute(selectStr)

  If rs.EOF Then
    ' No matching row, so there is nothing to write.
    WScript.Echo "No document named " & leaf & " was found."
  Else
    Set mstream = CreateObject("ADODB.Stream")
    mstream.Type = 1 ' adTypeBinary
    mstream.Open
    mstream.Write rs.Fields("Content").Value
    mstream.SaveToFile outputPath, 2 ' adSaveCreateOverWrite
    mstream.Close
  End If

  rs.Close
  cn.Close
End Sub

Copy this code into Notepad and replace [SERVERNAME], [CONTENTDATABASE], [LEAFNODE] and [OUTPUTPATH] with appropriate values. Save this file as a VBS script and execute from the command line as:

C:\>CSCRIPT ExtractDoc.vbs

The SQL query is a bit more complicated than the SharePoint 2003 version. It joins the dbo.AllDocs table with the dbo.AllDocStreams table, which actually contains the blob Content field. There is also a dbo.AllDocVersions table. However, with versioning enabled this table does not appear to be updated as new versions are added. With each new version a new row is added to both the dbo.AllDocs and dbo.AllDocStreams tables. Conveniently, there is an IsCurrentVersion boolean field in the dbo.AllDocs table. The join between dbo.AllDocs and dbo.AllDocStreams is done on the shared uniqueidentifier ID fields and a Level field, which appears to increment with each new version.

The LeafNode is the name of the file to retrieve. This sample script assumes that the document is in the root of the containing document library. If it were in a subdirectory, an additional DirName condition would need to be added to the query and passed as a parameter.

NOTE: This was tested with a MOSS 2007 content database. This was not tested with a WSS 3.0 content database, however, I expect the schema is the same. 

This is certainly not something to use in a production environment where automated document retrieval must be a repeatable and reliable process. But it is a quick and dirty means of extracting a document from a restored database that spares you from the overhead of restoring a SharePoint/WSS environment.

This was written using a simple VBS script so that production support folks can use it easily without having to compile a .NET assembly.

References

How to recover SharePoint document once deleted from recycle bin - The author talks about using textcopy, which is part of the SQL Server 2000 Resource Kit, to perform the same task. In addition, he has a number of screenshots and additional instructions which may prove helpful. The blog entry mentions that, although textcopy is not officially supported for use with SQL Server 2005, it does work. The kit is available for download if you are an MSDN subscriber. Otherwise, it's included in the book,

Generic DataContract Serializer or The Last DataContract Serializer on Earth

Alternate post title is written with apologies to Vincent Price.

When working with any technology that requires a contract-driven approach, like WCF, there is often a need to serialize and deserialize objects that don't come from a well-known source exposing a complex type. You may find yourself having to read an object from disk or write one to disk. A front end may need to be developed in parallel with the service that will supply it with populated data contracts. Until that service is available, stubbed XML files may be used to populate predefined data contracts. Code similar to the following is required:

public MyDataContract ReadObject(string objectData)
{
  MyDataContract deserializedObject = null;
  using (StringReader reader = new StringReader(objectData))
  {
     XmlTextReader xmlReader = new XmlTextReader(reader);
     // A data contract is read with the DataContractSerializer, which exposes ReadObject.
     DataContractSerializer ser = new DataContractSerializer(typeof(MyDataContract));
     deserializedObject = (MyDataContract)ser.ReadObject(xmlReader, true);
     xmlReader.Close();
  }
  return deserializedObject;
}

With each schema that's stubbed out and deserialized, a new ReadObject method is required. A more generic approach removes this duplication:

using System.Xml;
using System.IO;
using System.Runtime.Serialization;

namespace MyCompany.Serialization
{
  internal static class GenericDataContractSerializer<T>
  {
    // Serializes outputObject to outputFile, overwriting any existing file.
    public static void WriteObject(T outputObject, string outputFile)
    {
      using (FileStream writer = new FileStream(outputFile, FileMode.Create))
      {
        DataContractSerializer ser = new DataContractSerializer(typeof(T));
        ser.WriteObject(writer, outputObject);
      }
    }

    // Deserializes an instance of T from a string of XML.
    public static T ReadObject(string objectData)
    {
      T deserializedObject = default(T);
      using (StringReader reader = new StringReader(objectData))
      {
        XmlTextReader xmlReader = new XmlTextReader(reader);
        DataContractSerializer ser = new DataContractSerializer(typeof(T));
        deserializedObject = (T)ser.ReadObject(xmlReader, true);
        xmlReader.Close();
      }
      return deserializedObject;
    }
  }
}

The class is used as follows:

string locationText = GetFile("MyStubbedDataContract.xml");
returnValue = GenericDataContractSerializer<MyDataContract>.ReadObject(locationText);

GetFile is responsible for loading the stubbed file, which could come from the file system or from the resources of the assembly. For what it's worth, I prefer to keep the stubbed files as embedded resources since it removes the need to move yet another file during deployment to a development server. By the time the code goes to production, the stubbed approach should have been abandoned in favor of a working service that returns data contracts dynamically populated with data.
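The post does not show GetFile itself, so the following is only a guess at one possible shape for it. It assumes the stub is either an embedded resource whose name ends with the file name or, failing that, a file on disk. The StubFiles class name and the fallback order are assumptions made for illustration.

```csharp
using System;
using System.IO;
using System.Reflection;

public static class StubFiles
{
    // Returns the text of a stubbed XML file, looking first at the embedded
    // resources of the executing assembly and then at the file system.
    public static string GetFile(string fileName)
    {
        Assembly assembly = Assembly.GetExecutingAssembly();

        // Embedded resource names usually look like "Namespace.Folder.File.xml",
        // so match on the trailing ".File.xml" portion.
        string suffix = "." + Path.GetFileName(fileName);

        foreach (string resourceName in assembly.GetManifestResourceNames())
        {
            if (resourceName.EndsWith(suffix, StringComparison.OrdinalIgnoreCase))
            {
                using (Stream stream = assembly.GetManifestResourceStream(resourceName))
                using (StreamReader reader = new StreamReader(stream))
                {
                    return reader.ReadToEnd();
                }
            }
        }

        // No matching resource was found; fall back to reading from disk.
        return File.ReadAllText(fileName);
    }
}
```

Checking the resources first keeps the development-server deployment free of loose files, while the disk fallback makes it easy to try out a new stub without rebuilding.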

If you're dealing with XML serialization, here's the analogous class for a generic XML serializer:

using System.Xml;
using System.IO;
using System.Xml.Serialization;

namespace MyCompany.Serialization
{
  internal static class GenericXmlSerializer<T>
  {
    public static void WriteObject(T outputObject, string outputFile)
    {
      using (FileStream writer = new FileStream(outputFile, FileMode.Create))
      {
        XmlSerializer ser = new XmlSerializer(typeof(T));
        ser.Serialize(writer, outputObject);
      }
    }

    public static T ReadObject(string objectData)
    {
      T deserializedObject = default(T); 
     
      using (StringReader reader = new StringReader(objectData))
      {
        XmlTextReader xmlReader = new XmlTextReader(reader); 
        XmlSerializer ser = new XmlSerializer(typeof(T));
        deserializedObject = (T)ser.Deserialize(xmlReader);
        xmlReader.Close();
      }

      return deserializedObject;
    }
  }
}

 

Flickr and .NET

I recently came across this article: Flickr-ing about with .NET. Anyone who has uploaded photos to Flickr or has an interest in the site can programmatically walk the site with this API. It's rather straightforward. The difficulty lies in coming up with a need for an application. I don't have all of my photos on one computer. My Flickr site has shots taken over two years ago that currently reside on PCs that have gone into a closet and are collecting dust. It would be easier to pull photos from my Flickr site than to go back to those PCs. Perhaps a backup routine could store a local copy of the photos on a personal Flickr site. The API could also be used to search by odd criteria. JPG Magazine accepts photo submissions for publication with the stipulation that they must be over 2200 pixels along their longest dimension. It's currently not possible to look for photos matching this criterion in Flickr's default interface. With that, we have two needs:

  • Back Up existing photos to local storage
  • Search for photos by obscure criteria
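The second need comes down to a simple size test once the photo dimensions are in hand. The sketch below makes no Flickr API calls. The PhotoInfo type and the PhotoFilter class are hypothetical, and the sketch assumes the width and height of each photo have already been retrieved by some other means. It also reads "over 2200 pixels" as strictly greater than 2200.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record of a photo's title and pixel dimensions.
public sealed class PhotoInfo
{
    public string Title { get; private set; }
    public int Width { get; private set; }
    public int Height { get; private set; }

    public PhotoInfo(string title, int width, int height)
    {
        Title = title;
        Width = width;
        Height = height;
    }
}

public static class PhotoFilter
{
    // Threshold taken from the submission rule described above.
    public const int MinLongestSide = 2200;

    // True when the longest side of the photo is over the threshold.
    public static bool MeetsSizeRule(PhotoInfo photo)
    {
        return Math.Max(photo.Width, photo.Height) > MinLongestSide;
    }

    // Returns only the photos that satisfy the size rule, in their original order.
    public static List<PhotoInfo> SelectLargeEnough(IEnumerable<PhotoInfo> photos)
    {
        List<PhotoInfo> result = new List<PhotoInfo>();
        foreach (PhotoInfo photo in photos)
        {
            if (MeetsSizeRule(photo))
            {
                result.Add(photo);
            }
        }
        return result;
    }
}
```

Using the longest side rather than the width means portrait and landscape shots are treated alike.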

Inferring SharePoint 2007 Web Service Parameters

The SharePoint 2007 SDK documentation for the Beta2TR is currently lacking in details around invoking the web services. One approach is to use .NET Reflector to crack open the assembly and see what the service in question does with the parameters once received. In this example, let's take a look at the StartWorkflow method of the Workflow web service. The web services deployed with SharePoint, including the Workflow web service, are in the C:\Program Files\common files\microsoft shared\web server extensions\12\ISAPI directory and are accessible through the _vti_bin virtual directory of your SharePoint site (e.g. http://servername/_vti_bin/webservice.asmx). Opening the workflow.asmx file in notepad.exe reveals that the work is done in the Microsoft.Office.WorkflowSoap assembly. The two locations I search for SharePoint assemblies are the subdirectories of C:\Program Files\common files\microsoft shared\web server extensions\12 and the GAC. In this case, Microsoft.Office.WorkflowSoap.dll is located in ...\12\Config\bin, so once we load it into Reflector we can see what the StartWorkflow method really does. The current Beta2TR SDK shows the following:

public XmlNode StartWorkflow ( string item, Guid templateId, XmlNode workflowParameters )

Walking through the disassembled code in Reflector shows that it uses the internal WorkflowImpl class to invoke the StartWorkflow method on the SPWorkflowManager class. The first parameter is the URL of the file on which the workflow will run. The second is the GUID of the SPWorkflowAssociation object that defines the association between the hosting document library and a workflow. The third is the data passed into the workflow upon initialization.

While we might have been able to infer these parameters from reading the StartWorkflow methods defined in the SDK on the Workflow web service and on the SPWorkflowManager and taking an educated guess as to how they're invoked, walking through the disassembled code removes all doubt. Clearly, the same technique can be applied to any of the other web services.

InfoPath Form Serialization and Schema Names

If you are working with SharePoint 2007 workflows, then you've probably had a need to serialize and deserialize InfoPath form XML data based on the instructions here: How to: Access Association and Initiation Form Data in a Workflow. I've found that running xsd.exe on the myschema.xsd exported in the source files of an InfoPath form results in a root class defined as follows:

...
[System.Xml.Serialization.XmlRootAttribute(Namespace="http://schemas.microsoft.com/office/infopath/2003/myXSD", IsNullable=false)]
public partial class myFields {
...

That myFields class name irks me a bit. The instructions in the How To article referenced above include the following text to address this:

Specifying a unique name for the form fields collection, rather than using the default name of myfields, helps ensure the class that is generated from the form schema file also has a unique name. This is especially important when you are programming a workflow that uses multiple forms.

However, you might not have control over the form design if you are not creating the forms yourself. Rather than altering the name of the myFields collection inside the InfoPath form, you can rename the class in the code generated by xsd.exe and point it back at the original element name:

...
[System.Xml.Serialization.XmlRootAttribute(Namespace="http://schemas.microsoft.com/office/infopath/2003/myXSD", IsNullable=false,ElementName="myFields")]
    public partial class InitWorkflow {
...

Specifying the ElementName in the XmlRootAttribute maintains compatibility with the XML document while allowing for a more logical class name. The price paid lies in the alteration of a tool-generated class. If xsd.exe is run again against the myschema.xsd source file, this manual change will need to be re-applied.
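To illustrate that the renamed class still reads the form's XML, here is a minimal sketch. The comments field, the FormReader helper and the sample XML are assumptions made for illustration and are not taken from a real form. The XmlType attribute mirrors what xsd.exe typically emits alongside XmlRoot.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Renamed root class; ElementName keeps it bound to the myFields element.
[XmlType(AnonymousType = true, Namespace = InitWorkflow.FormNamespace)]
[XmlRoot(Namespace = InitWorkflow.FormNamespace, IsNullable = false, ElementName = "myFields")]
public partial class InitWorkflow
{
    public const string FormNamespace =
        "http://schemas.microsoft.com/office/infopath/2003/myXSD";

    // Hypothetical form field used only for this example.
    [XmlElement("comments")]
    public string Comments { get; set; }
}

public static class FormReader
{
    // Deserializes the form XML into the renamed class.
    public static InitWorkflow Read(string formXml)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(InitWorkflow));
        using (StringReader reader = new StringReader(formXml))
        {
            return (InitWorkflow)serializer.Deserialize(reader);
        }
    }
}
```

Passing FormReader.Read a document whose root element is my:myFields in the namespace above should yield an InitWorkflow instance, even though no class named myFields exists any longer.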

Recover Documents from SharePoint 2003 Database

This will not work with MOSS 2007. If you are looking for a MOSS 2007 solution for recovering documents, go here:

Recover Documents from MOSS 2007 Database

At some point a file will be accidentally deleted from a SharePoint library. There are tools to guard against unintentional file deletions, such as the Recycle Bin for SharePoint 2003, but it's not installed with the product out of the box. Incidentally, MOSS '07 comes with a recycle bin. However, with SPS 2003/WSS 2.0, if a user deletes a file, most often the recovery path is to restore a backup copy of the appropriate content database and extract the file. Building the backup environment can be time consuming. Instead, the file can be extracted by going directly against a restored copy of the content database with a simple VB Script. This approach is rather old school. A .NET exe could have done the job, but it takes so little code that it didn't seem worthwhile to put together a compiled assembly and include parameter handling logic.

A few disclaimers are necessary. This script goes directly against a SharePoint 2003/WSS 2.0 content database which is generally discouraged. Any code using the database directly will not be supported by MS Product Support Services. Instead, the SharePoint and WSS APIs are the way to go. If the database is modified directly rather than through the published SharePoint and WSS APIs Product Support Services cannot properly troubleshoot any unexpected issues. This script reads from the database so we're safe from errant modifications. However, the data structure the script queries could change in a future service pack. This will not work with MOSS 2007. The need for this kind of approach should be reduced with the introduction of the MOSS 2007 Recycling Bin.

With that out of the way, let's proceed. The script queries the Docs table which contains the application documents and retrieves the most current version based on the document name which is then streamed out as binary data to a file:

Set cn = CreateObject("ADODB.Connection")
Set rs = CreateObject("ADODB.Recordset")
cn.Open "Provider=SQLOLEDB;data Source=[SERVERNAME];Initial Catalog=[DATABASE_NAME];Trusted_Connection=yes"
Set rs = cn.Execute("SELECT Content FROM Docs WHERE LeafName = 'FILENAME.EXT'")
Set mstream = CreateObject("ADODB.Stream")
mstream.Type = 1
mstream.Open
mstream.Write rs.Fields("Content").Value
mstream.SaveToFile "C:\FILENAME.EXT", 2
rs.Close
cn.Close

Copy this code into Notepad and, naturally, replace [SERVERNAME], [DATABASE_NAME], and FILENAME.EXT with appropriate values. Save this file as a VBS script and execute from the command line as:

C:\>CSCRIPT ExtractDoc.vbs

Note that the SQL query is rather straightforward. It simply looks for a file according to the LeafName, which corresponds to the document name. The code assumes that at least one row will be returned. It doesn't do any check should the query return zero rows. It's recommended that the query executed in the script first be tested in SQL Query Analyzer to ensure that you retrieve the expected single result. If the same file name is found in more than one document library or folder, you'll have multiple results. There's no guarantee that the first file returned will be the file you want. Use other values in the Docs table to narrow the results down to one file, then paste the desired query into the script.

This is certainly not something to use in a production environment where automated document retrieval must be a repeatable and reliable process. But it is a quick and dirty means of extracting a document from a restored database that spares you from the overhead of restoring a SharePoint/WSS environment.

Update: I recently came across a post about SPExplorer, a tool for walking a SharePoint 2003 database directly rather than through an API. A related blog post mentions that it has the ability to extract documents.

 