Almost every application can benefit from persisting data to a local, portable data store. For example, when Microsoft first released Access, I worked for the tools group in Microsoft's IT department writing internal software tools. I used Access to store user and error information for my applications while they were in beta. I remember one application that would "crash" at the same time each day. By looking at the Access database, I found what the error was and, more importantly, who had been using the application when it crashed. It was a simple matter to call that individual, get a better understanding of what they had been doing and why, and code around the mistake. I've also used this approach with external customers: I simply asked them to email me the Access .mdb file so I could determine what the problems were.
Visual Studio contains a lightly documented tool called MSDataSetGenerator (there is a similar but less robust tool in the .NET SDK called xsd.exe) that lets you do similar things with XML files. Much like Access databases, these XML files are portable and can be shared across several applications. Let me describe some of the ways I've used them and then provide a step-by-step walkthrough to show you how to get started.
I run Exchange 2003 on my home network, and since my daughters are in college, they use OWA to access their mail, which means the cool Outlook 2003 Junk Email filter doesn't catch their spam. So I decided to write my own spam filter using .NET and the new Spam Confidence Level (SCL) in Exchange. I won't go into the details of that application here, but suffice it to say that I use a rather resource-intensive reverse DNS lookup to determine whether a message is from a spammer. If it is, I add the sender's IP address to the XML "database". That way, the application knows the IP address belongs to a probable spammer, and on future requests (until the TTL expires) it skips the DNS lookup and just checks the XML database. I have shared this application with Exchange hosting organizations, and they found, as I did, that the performance impact was negligible. This attests to the power and speed of these XML "databases", even when they hold thousands of entries, as my spam XML database does.
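To make the caching idea concrete, here is a minimal sketch of an IP cache with a TTL backed by an XML file. It is not the code from my filter: it uses a plain (untyped) DataSet so it runs without any generated classes, and the class name SpamIpCache, the table and column names, the file name, and the one-hour TTL are all made up for illustration.

```csharp
using System;
using System.Data;
using System.IO;

// Hypothetical sketch: a spam-IP cache persisted as an XML "database".
// All names here (SpamIpCache, SpamIp, Address, ExpiresUtc) are illustrative only.
class SpamIpCache
{
    private readonly DataSet _store = new DataSet("SpamCache");
    private readonly DataTable _ips;
    private readonly string _path;

    public SpamIpCache(string path)
    {
        _path = path;
        _ips = _store.Tables.Add("SpamIp");
        _ips.Columns.Add("Address", typeof(string));
        DataColumn expires = _ips.Columns.Add("ExpiresUtc", typeof(DateTime));
        expires.DateTimeMode = DataSetDateTime.Utc;   // keep expiry times in UTC across save/load
        _ips.PrimaryKey = new[] { _ips.Columns["Address"] };

        // Reload entries saved by an earlier run, if there are any.
        if (File.Exists(path))
            _store.ReadXml(path);
    }

    // True if the address is cached and its TTL has not expired yet.
    public bool IsKnownSpammer(string address)
    {
        DataRow row = _ips.Rows.Find(address);
        return row != null && (DateTime)row["ExpiresUtc"] > DateTime.UtcNow;
    }

    // Adds or refreshes an entry and saves the whole cache back to disk.
    public void Add(string address, TimeSpan ttl)
    {
        DataRow row = _ips.Rows.Find(address);
        if (row == null)
            _ips.Rows.Add(address, DateTime.UtcNow + ttl);
        else
            row["ExpiresUtc"] = DateTime.UtcNow + ttl;
        _store.WriteXml(_path);
    }
}

class Program
{
    static void Main()
    {
        string path = Path.Combine(Path.GetTempPath(), "spamcache-sample.xml");
        SpamIpCache cache = new SpamIpCache(path);

        string ip = "192.0.2.10";   // documentation-only address
        if (!cache.IsKnownSpammer(ip))
        {
            // The expensive reverse DNS check would run here; assume it flagged the sender.
            cache.Add(ip, TimeSpan.FromHours(1));
        }
        Console.WriteLine(ip + " cached as spammer: " + cache.IsKnownSpammer(ip));
    }
}
```

Saving the whole file on every Add keeps the sketch short; a busier application would probably batch its writes.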
I later enhanced the application to record the reason for each detection (I use several methods to set the SCL) and to increment the hit count for that IP address. This allowed me to write a second application that used the same XML file to evaluate the spam hits on the server. I found that over a short period of time, spammers would flood my server from several bogus domains while using the same rather small set of IP addresses. So I wrote a third application that uses IPSec to block those IP addresses at the network stack. It reads the Exchange spam XML file, organizes the hits, and displays them. It then lets me mark the addresses I want to block and stores them in another XML database file for the IPSec filter. That way, the block list loads each time I run the app or reboot the server.
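As a rough sketch of that sharing, a second application can load the same XML with an untyped DataSet, bump a hit count, and list the worst offenders. The element and attribute names below (spam, hit, ip, reason, count) and the sample values are assumptions for illustration, not the actual layout of my spam file.

```csharp
using System;
using System.Data;
using System.IO;

// Hypothetical sketch of a reporting app reading the filter's XML file.
static class HitReport
{
    // Made-up sample standing in for the file the filter application writes.
    public const string SampleXml =
        "<spam>" +
        "<hit ip=\"192.0.2.10\" reason=\"reverse-dns\" count=\"3\" />" +
        "<hit ip=\"198.51.100.7\" reason=\"blocklist\" count=\"12\" />" +
        "</spam>";

    public static DataTable Load(TextReader reader)
    {
        DataSet ds = new DataSet();
        ds.ReadXml(reader);   // no schema supplied, so the columns are inferred as strings
        return ds.Tables["hit"];
    }

    // Increments the hit count for one address, if it is present.
    public static void RecordHit(DataTable hits, string ip)
    {
        foreach (DataRow row in hits.Rows)
        {
            if ((string)row["ip"] == ip)
            {
                row["count"] = (int.Parse((string)row["count"]) + 1).ToString();
                return;
            }
        }
    }

    // Returns the rows ordered from most hits to fewest.
    public static DataRow[] ByHitCount(DataTable hits)
    {
        DataRow[] rows = hits.Select();
        Array.Sort(rows, (a, b) =>
            int.Parse((string)b["count"]).CompareTo(int.Parse((string)a["count"])));
        return rows;
    }

    static void Main()
    {
        DataTable hits = Load(new StringReader(SampleXml));
        RecordHit(hits, "192.0.2.10");
        foreach (DataRow row in ByHitCount(hits))
            Console.WriteLine(row["ip"] + " " + row["reason"] + " " + row["count"]);
    }
}
```

With a typed dataset generated from a schema, the count column could be a real integer and the string parsing would go away.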
The point is that these files are quick, portable, and an easy way to share data between applications.
Oh, and if any of these apps sound useful, reply with a comment and I’ll write another blog post with a pointer to the code, which you can download and use.
OK, so let’s look at some code.
1. Open Visual Studio and create a C# Console application.
2. Next, using the Solution Explorer, create a New Xml file and call it: testXml.xml
3. Put in the following content and save the file:
<root>
<myData myDataStuff1="" myDataStuff2="" />
</root>
4. Next, in VS create the schema for the xml file (right-click over the .xml file and select Create Schema). You should now have a “testXml.xml” file and a “testXml.xsd” file.
5. Open the Properties window for the testXml.xsd file. In the Custom Tool box, type: MSDataSetGenerator. Enable “Show All Files” in the Solution Explorer so you can see the newly generated testXml.cs file.
6. Open the testXml.cs file and notice that over 300 lines of C# code have been generated for this three-line XML file. Review the code and you will see that the tool has created code to support collections, events, etc.
7. In the Main function of the console application, paste the following code:
// create an instance of the testxml dataset
testXml mytest = new testXml();
// you can load data from an existing xml file, this shows how easy it is to move data from
// one location or application to another.
// mytest.ReadXml(@"c:\myxmlfile.xml");
//or, you can simply populate the dataset and save it to disk
mytest.myData.AddmyDataRow("my first row data1", "my first row data 2");
mytest.myData.AddmyDataRow("my second row data1", "my second row data 2");
// you can use the generated collections to iterate through the rows
foreach (testXml.myDataRow row in mytest.myData.Rows)
{
System.Diagnostics.Debug.WriteLine("Stuff 1: " + row.myDataStuff1);
System.Diagnostics.Debug.WriteLine("Stuff 2: " + row.myDataStuff2);
}
mytest.WriteXml(@"C:\myoutput.xml");
You’re done. With the tool-generated code, you can easily add items, browse through collections, and move the dataset from one application to another, and you never have to use XPath or parse the XML yourself in your code!
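If the receiving application doesn't have the generated testXml class, it can still read the file with a plain DataSet. The sketch below assumes the saved file has the same shape as the sample from step 3 (a root element containing myData elements with two attributes); the file a typed dataset actually writes may use a different root element name or a namespace, so treat the sample content and the temp-folder path as stand-ins.

```csharp
using System;
using System.Data;
using System.IO;

// Hypothetical sketch: reading the saved XML without the generated typed dataset.
static class ReadBack
{
    public static DataTable LoadMyData(string path)
    {
        DataSet ds = new DataSet();
        ds.ReadXml(path);   // the table and its columns are inferred from the XML
        return ds.Tables["myData"];
    }

    static void Main()
    {
        // Write a stand-in file shaped like the sample XML from step 3.
        string path = Path.Combine(Path.GetTempPath(), "myoutput-sample.xml");
        File.WriteAllText(path,
            "<root>" +
            "<myData myDataStuff1=\"my first row data1\" myDataStuff2=\"my first row data 2\" />" +
            "<myData myDataStuff1=\"my second row data1\" myDataStuff2=\"my second row data 2\" />" +
            "</root>");

        foreach (DataRow row in LoadMyData(path).Rows)
            Console.WriteLine(row["myDataStuff1"] + " | " + row["myDataStuff2"]);
    }
}
```

The trade-off is that columns are looked up by string name here, so you lose the compile-time checking that properties such as row.myDataStuff1 give you in the generated code.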