December, 2008

  • Eric White's Blog

    A More Robust Approach for Handling XName Objects in LINQ to XML

    One of the advantages of V2 of the Open XML SDK is that it provides a strongly typed XML document object model (DOM). Because elements are represented by classes, and attributes by properties of those classes, developers are not at risk of misspelling element and attribute names. I was chatting about the Open XML SDK V2 with a customer, who appreciated the advantages of this approach. However, V2 of the SDK is not an option for them: it can’t be used to build a shipping product until the next version of Office is released, and that doesn’t fit their schedule. There is another approach, using V1 of the SDK and LINQ to XML, that provides some of the benefits of a strongly typed DOM. It consists of declaring a static class with static, initialized XName and XNamespace fields, and then using those XName and XNamespace objects in the LINQ to XML code that creates and queries XML trees.

    If you generate the static classes that contain the names automatically (from XSD), you ensure that the names and namespaces are not misspelled. In addition, this approach has the benefit of pre-atomizing the XName and XNamespace objects. I’ve written a few posts on the benefits of pre-atomization of XName and XNamespace objects:

    Writing Robust LINQ to XML Code that Performs Well

    Atomized XName and XNamespace Objects

    Preatomization of XName Objects
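
    The idea of generating the name classes from XSD could be sketched roughly as follows. This is only an illustration, not the generator used for this post: the inlined schema, the http://example.com/sample namespace, and the class name Sample are all made up, and the sketch assumes that every declared element and attribute name belongs to the schema's target namespace.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

class XsdNameGenerator
{
    static void Main()
    {
        XNamespace xs = "http://www.w3.org/2001/XMLSchema";

        // Tiny hypothetical schema, inlined so the sketch is self-contained.
        // A real tool would load the schema files with XDocument.Load instead.
        var xsd = XDocument.Parse(
@"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'
           targetNamespace='http://example.com/sample'>
  <xs:element name='package'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='part' />
      </xs:sequence>
      <xs:attribute name='name' />
    </xs:complexType>
  </xs:element>
</xs:schema>");

        string targetNs = (string)xsd.Root.Attribute("targetNamespace");

        // Collect the distinct names of all declared elements and attributes.
        // Simplification (an assumption of this sketch): every name is treated
        // as belonging to the target namespace.
        var names = xsd.Descendants()
            .Where(e => e.Name == xs + "element" || e.Name == xs + "attribute")
            .Select(e => (string)e.Attribute("name"))
            .Where(n => n != null)
            .Distinct()
            .OrderBy(n => n, StringComparer.Ordinal);

        Console.WriteLine("public static class Sample");
        Console.WriteLine("{");
        Console.WriteLine("    public static XNamespace ns = \"{0}\";", targetNs);
        Console.WriteLine();
        foreach (var n in names)
        {
            // The '@' prefix keeps names that collide with C# keywords compilable.
            // Names containing '-' or '.' would still need mapping to valid identifiers.
            Console.WriteLine("    public static XName @{0} = ns + \"{0}\";", n);
        }
        Console.WriteLine("}");
    }
}
```

    For the inlined schema, the sketch writes a class with one XName field each for name, package, and part. Because the names come straight from the schema, a misspelling can only originate in the schema itself.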

    Pre-atomization can yield performance benefits both when creating and when querying XML trees. The static fields of each class are initialized once, when the runtime runs that class’s type initializer, which happens no later than the first access to one of its fields. After that, the application does not need to atomize any further XName or XNamespace objects for those names.
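
    To make the atomization point concrete, here is a minimal sketch. The class name W and its single field are picked only for illustration. It checks that a name built from a string at run time resolves to the very same XName instance that the static field already holds, which is what lets the pre-initialized field skip the lookup.

```csharp
using System;
using System.Xml.Linq;

// Illustrative name class; W and p are assumptions made for this sketch.
public static class W
{
    public static XNamespace ns_w =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    public static XName p = ns_w + "p";
}

class AtomizationDemo
{
    static void Main()
    {
        // Each of these conversions has to look the name up in the atomization table.
        XName fromExpandedName =
            "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}p";
        XName fromGet = XName.Get("p",
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main");

        // XName objects are atomized: the same namespace and local name
        // always yield the same instance, so both comparisons print True.
        Console.WriteLine(object.ReferenceEquals(W.p, fromExpandedName));
        Console.WriteLine(object.ReferenceEquals(W.p, fromGet));
    }
}
```

    Code that uses W.p directly avoids that lookup entirely; code that passes a string pays for it on every call.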

    There is one additional advantage: you get some level of IntelliSense support when you use this approach.

    Open XML markup has a fair number of element and attribute names; I don’t know the exact count. As a data point, I queried a fairly large Open XML document (Part 4 of the Ecma Open XML specification) for distinct element and attribute names, and counted 809 distinct qualified names in 21 namespaces.

    I wondered whether pre-atomizing two or three thousand XName objects would present a performance problem at application startup, so I wrote a small program that generates a test program containing 3800 initialized XName objects. I took care to generate names that are representative of the names in the Open XML specification: some are short, some are long, and the names are in a variety of namespaces. When run on a fairly slow laptop (2 GHz, single core), program initialization took approximately half a second, which is reasonable. Further on in this post, I present the small program that generates the test program.
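
    If you want to time the initialization of one particular name class, something like the following sketch may help. It is not how the half-second figure above was obtained. For a class without a static constructor, the runtime chooses when the static field initializers run (at some point before the first access to a static field), so the sketch forces the type initializer with RuntimeHelpers.RunClassConstructor and times just that call. The class SampleNames and its namespace are made up for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;
using System.Xml.Linq;

// Hypothetical stand-in for a generated name class.
public static class SampleNames
{
    public static XNamespace ns = "http://example.com/sample";

    public static XName alpha = ns + "alpha";
    public static XName beta = ns + "beta";
    public static XName gamma = ns + "gamma";
}

class TimingSketch
{
    static void Main()
    {
        var sw = Stopwatch.StartNew();

        // Runs the type initializer (and with it all static field initializers)
        // if it has not run yet; a second call would be a no-op.
        RuntimeHelpers.RunClassConstructor(typeof(SampleNames).TypeHandle);

        sw.Stop();
        Console.WriteLine("SampleNames initialized in {0} ms",
            sw.Elapsed.TotalMilliseconds);
    }
}
```

    The measured time includes JIT compilation of the initializer as well as the atomization itself, so treat the number as an upper bound for the one-time cost of that class.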

    The following example shows the approach of declaring a couple of static classes that contain static, initialized fields. You can see that the code in Main uses the initialized names in the static classes both to construct the XML tree and to query it:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Xml.Linq;
     
    public static class Pkg
    {
        public static XNamespace ns_pkg =
            "http://schemas.microsoft.com/office/2006/xmlPackage";
     
        public static XName package = ns_pkg + "package";
        public static XName part = ns_pkg + "part";
        public static XName name = ns_pkg + "name";
        public static XName contentType = ns_pkg + "contentType";
        public static XName xmlData = ns_pkg + "xmlData";
    }
     
    public static class Ext
    {
        public static XNamespace ns_ext =
            "http://schemas.openxmlformats.org/officeDocument/2006/extended-properties";
     
        public static XName Properties = ns_ext + "Properties";
        public static XName Template = ns_ext + "Template";
        public static XName TotalTime = ns_ext + "TotalTime";
    }
     
    class Program
    {
        static void Main(string[] args)
        {
            var pkg = new XElement(Pkg.package,
                new XAttribute(XNamespace.Xmlns + "pkg",
                    "http://schemas.microsoft.com/office/2006/xmlPackage"),
                new XElement(Pkg.part,
                    new XAttribute(Pkg.name, "/docProps/app.xml"),
                    new XAttribute(Pkg.contentType,
                        "application/vnd.openxmlformats-officedocument.extended-properties+xml"),
                    new XElement(Pkg.xmlData,
                        new XElement(Ext.Properties,
                            new XAttribute("xmlns", Ext.ns_ext),
                            new XElement(Ext.Template, "Normal.dotm"),
                            new XElement(Ext.TotalTime, "1")
                        )
                    )
                )
            );
     
            var totalTime =
                (from e in pkg.Descendants(Ext.TotalTime)
                 select (int)e)
                .First();
     
            Console.WriteLine(totalTime);
        }
    }
     

    Here is the program that generates the test program:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Xml.Linq;
     
    class Program
    {
        static HashSet<string> usedNames;
     
        static void Gen(string className, string nsAbbr, string ns, int count)
        {
            Console.WriteLine("public static class {0}", className);
            Console.WriteLine("{");
            Console.WriteLine("    public static XNamespace {0} = \"{1}\";", nsAbbr, ns);
            Console.WriteLine();
           
            Random r = new Random();
     
            for (int i = 0; i < count; i++)
            {
                string s = "";
                // the following loop (using the usedNames HashSet) makes sure that
                // the code does not generate duplicate names.
                while (true)
                {
                    // make sure that the first character of the name is a letter A-Z
                    char startChar = (char)('A' + (int)(r.NextDouble() * 26));
                    Guid g = Guid.NewGuid();
                    int len = (int)Math.Floor(r.NextDouble() * 12 + 2);
                    s = startChar.ToString() +
                        g.ToString()
                         .Replace("-", "")
                         .Substring(0, len);
     
                    if (usedNames.Add(s))
                        break;
                }
                Console.WriteLine("    public static XName {0} = {1} + \"{0}\";", s, nsAbbr);
            }
            Console.WriteLine("}");
            Console.WriteLine();
        }
     
        static void Main(string[] args)
        {
            Console.WriteLine(@"using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Xml.Linq;
    ");
            usedNames = new HashSet<string>();
            Gen("Ns_W", "ns_w",
                "http://schemas.openxmlformats.org/wordprocessingml/2006/main", 1000);
            Gen("Ns_VE", "ns_ve",
                "http://schemas.openxmlformats.org/markup-compatibility/2006", 200);
            Gen("Ns_O", "ns_o",
                "urn:schemas-microsoft-com:office:office", 1000);
            Gen("Ns_R", "ns_r",
                "http://schemas.openxmlformats.org/officeDocument/2006/relationships", 600);
            Gen("Ns_M", "ns_m",
                "http://schemas.openxmlformats.org/officeDocument/2006/math", 1000);
            Console.WriteLine(@"
    public class Program
    {
        public static void Main(string[] args)
        {
            Console.WriteLine(""Entered Main"");
        }
    }
    ");
        }
    }
     

    The test program looks something like this:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Xml.Linq;
     
    public static class Ns_W
    {
        public static XNamespace ns_w = "http://schemas.openxmlformats.org/wordprocessingml/2006/main";
     
        public static XName A2dff9d51ac4 = ns_w + "A2dff9d51ac4";
        public static XName F362d21f0a = ns_w + "F362d21f0a";
        public static XName Ne594dbac67 = ns_w + "Ne594dbac67";
        public static XName X4cad35 = ns_w + "X4cad35";
        public static XName Ybca171196db = ns_w + "Ybca171196db";
        public static XName C41d0325b2028 = ns_w + "C41d0325b2028";
        public static XName D3e0eb73ab5a94 = ns_w + "D3e0eb73ab5a94";
        public static XName M5df215a698c = ns_w + "M5df215a698c";
        public static XName Zebe33a = ns_w + "Zebe33a";
        public static XName J8d2563d1a13 = ns_w + "J8d2563d1a13";
        // ...
    }
     
    public static class Ns_VE
    {
        public static XNamespace ns_ve = "http://schemas.openxmlformats.org/markup-compatibility/2006";
     
        public static XName Afd026fa849e = ns_ve + "Afd026fa849e";
        public static XName F206884417 = ns_ve + "F206884417";
        public static XName N867bd48925 = ns_ve + "N867bd48925";
        public static XName X801087 = ns_ve + "X801087";
        public static XName Yf601615604c = ns_ve + "Yf601615604c";
        public static XName C1701240984e3 = ns_ve + "C1701240984e3";
        // ...
    }
     
    public static class Ns_O
    {
        public static XNamespace ns_o = "urn:schemas-microsoft-com:office:office";
     
        public static XName A4a22d12b3d2 = ns_o + "A4a22d12b3d2";
        public static XName Fa7d35e114 = ns_o + "Fa7d35e114";
        public static XName N5154768163 = ns_o + "N5154768163";
        public static XName Xdf110a = ns_o + "Xdf110a";
        public static XName Yece0763cf81 = ns_o + "Yece0763cf81";
        public static XName C5322c3f14297 = ns_o + "C5322c3f14297";
        public static XName Dd1d69594fef44 = ns_o + "Dd1d69594fef44";
        public static XName Mbfb32a08d32 = ns_o + "Mbfb32a08d32";
        public static XName Z509498 = ns_o + "Z509498";
        public static XName J16db47997f9 = ns_o + "J16db47997f9";
        // ...
    }
     
    public static class Ns_R
    {
        public static XNamespace ns_r = "http://schemas.openxmlformats.org/officeDocument/2006/relationships";
     
        public static XName A9676b45e848 = ns_r + "A9676b45e848";
        public static XName F51ab6cb14 = ns_r + "F51ab6cb14";
        public static XName Ncfb544f931 = ns_r + "Ncfb544f931";
        public static XName X4f67fc = ns_r + "X4f67fc";
        // ...
    }
     
    public static class Ns_M
    {
        public static XNamespace ns_m = "http://schemas.openxmlformats.org/officeDocument/2006/math";
     
        public static XName A5993dd93a92 = ns_m + "A5993dd93a92";
        public static XName Fa5592d4c1 = ns_m + "Fa5592d4c1";
        public static XName N29869d8a94 = ns_m + "N29869d8a94";
        public static XName Xb80c69 = ns_m + "Xb80c69";
        // ...
    }
     
    public class Program
    {
        public static void Main(string[] args)
        {
            Console.WriteLine("Entered Main");
        }
    }
     

    Code is attached.
