Robert Lucero's Testing Blog

Thoughts on Software Testing, Development and the Technologies I work on

April, 2012

    Quick Aside on XML - The Wrong Way to Use XML


    XML is a great way to collect and author data so that it can be consumed by any number of applications.  There are some exceptionally powerful tools that can consume XML formats and enforce levels of data validation, but this is only useful if the implementation makes sense.  After working on some legacy tools and trying to design some new test systems, I’ve realized that there’s a right way to use XML and a horribly wrong way to use XML.

    Wrong Way #1: Parsing XML With XElement

    Don’t get me wrong, XElement/XPath searching is a great way to look through XML when you’re just trying to retrieve specific values from generic XML files.
    For example:
    I want to find every instance within 10 types of XML files where the server value is IIS6.0 and change it to IIS8.0.

    XElement and XPath can quickly search through the XML files regardless of their structure (provided that they are valid XML files) and change those values.  But the limitation of this solution is that you’re searching through the files generically and treating XML as just a structured input type.  In addition, it just doesn’t scale: every new piece of data you need means another hand-written query.  Even with fancy regex/XPath queries you can only do so much.
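    To make the generic search concrete, here is a minimal sketch of that kind of find-and-replace.  The element name "server" and the sample document are assumptions made up for illustration; real files would need their own query (and namespace handling, if they use namespaces).

```csharp
using System;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class ServerValueRewriter
{
    // Replaces the text of every <server> element whose value equals oldValue,
    // no matter where it sits in the document. Returns the number of changes.
    // The element name "server" is a hypothetical example.
    public static int ReplaceServerValue(XDocument doc, string oldValue, string newValue)
    {
        // Materialize the matches first so we don't modify the tree while enumerating it.
        var matches = doc.XPathSelectElements("//server")
                         .Where(e => e.Value == oldValue)
                         .ToList();

        foreach (XElement match in matches)
        {
            match.Value = newValue;
        }

        return matches.Count;
    }

    public static void Main()
    {
        XDocument doc = XDocument.Parse(
            "<farm>" +
            "<machine><server>IIS6.0</server></machine>" +
            "<machine><server>Apache</server></machine>" +
            "</farm>");

        int changed = ReplaceServerValue(doc, "IIS6.0", "IIS8.0");
        Console.WriteLine(changed);
        Console.WriteLine(doc);
    }
}
```

    This works on any well-formed file, which is exactly the point: the code knows nothing about what the document is supposed to look like.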

    Wrong Way #2: Manually Translating XML to Objects

    Another quick-fix solution I’ve seen implemented a number of times is using Wrong Way #1 to build up an object by hand.  First you create the object that you’re trying to build, then you populate it by reading values out of the document one at a time and setting them.  At this point XML buys you very little; any other way of extracting information from a file would be just as good.  The one slight advantage of XML is that the available APIs can help query values.

    This pattern’s translation layer is also a maintenance burden, because every step of the translation from XML document to object is written out by hand.  Nothing enforces the mapping between the object and the XML document when the translation occurs, so when either the XML format or the object changes, the mismatch is hidden inside the translation layer:

    Figure 1: Don’t Do This.
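
    As a rough illustration of the pattern being warned against, here is a sketch of a hand-written translation layer.  The ServerConfig class and the element and attribute names are hypothetical, not taken from any real tool.

```csharp
using System;
using System.Xml.Linq;

// Hypothetical object model, invented for this example.
public class ServerConfig
{
    public string Name { get; set; }
    public string Server { get; set; }
    public int Port { get; set; }
}

public static class ManualTranslator
{
    // Every property is copied by hand. The node names are plain strings, so a
    // typo or a renamed element still compiles and only shows up at run time.
    public static ServerConfig FromXml(XElement root)
    {
        var config = new ServerConfig();
        config.Name = (string)root.Attribute("name");   // null if the attribute is missing
        config.Server = (string)root.Element("server"); // null if the element is missing
        config.Port = (int)root.Element("port");        // throws if the element is missing
        return config;
    }

    public static void Main()
    {
        XElement root = XElement.Parse(
            "<config name='web01'><server>IIS8.0</server><port>80</port></config>");

        ServerConfig config = FromXml(root);
        Console.WriteLine(config.Name + " " + config.Server + " " + config.Port);
    }
}
```

    Each new property means another line in FromXml, plus a matching line in whatever code writes the object back out.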

    Wrong Way #3: Never Using a Schema

    If you’ve taken the time to plan out your XML document layout and design an object model for how you’re going to use all that sweet XML data, you’ve probably also created a schema.  So, use it!  Schemas that are loaded into editors, like Visual Studio, can make manual authoring of XML much faster.  In addition, if you enforce your schema at document load time, you can ensure that you’re only loading documents that the system can handle.
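
    Enforcing a schema at load time might look something like the sketch below.  The inline schema and the ServerConfig/Server/Port names are assumptions for illustration; in practice the schema would more likely be loaded from an .xsd file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Linq;
using System.Xml.Schema;

public static class SchemaValidatedLoader
{
    // Hypothetical schema, inlined so the example is self-contained.
    private const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='ServerConfig'>
    <xs:complexType>
      <xs:sequence>
        <xs:element name='Server' type='xs:string' />
        <xs:element name='Port' type='xs:int' />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>";

    // Parses the XML and returns every schema validation error message.
    // An empty list means the document matches the schema.
    public static List<string> Validate(string xml)
    {
        var schemas = new XmlSchemaSet();
        // Passing null uses the target namespace declared in the schema itself.
        schemas.Add(null, XmlReader.Create(new StringReader(Xsd)));

        var errors = new List<string>();
        XDocument doc = XDocument.Parse(xml);
        doc.Validate(schemas, (sender, e) => errors.Add(e.Message));
        return errors;
    }

    public static void Main()
    {
        string good = "<ServerConfig><Server>IIS8.0</Server><Port>80</Port></ServerConfig>";
        string bad = "<ServerConfig><Server>IIS8.0</Server><Port>eighty</Port></ServerConfig>";

        Console.WriteLine(Validate(good).Count);
        foreach (string error in Validate(bad))
        {
            Console.WriteLine(error);
        }
    }
}
```

    Collecting the messages rather than throwing on the first one is a design choice; a loader that should refuse bad documents outright could throw from the callback instead.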

    XML Done Right

    First, design the object model.  The important logic is going to work with objects, not with XML files or random values pulled from a file.  Make the model strongly typed and serializable.  The people consuming your data will appreciate you for it.

    Second, let the XmlSerializer do the translation work for you.  It won’t trip over misspelled node names or incorrect value translations (most of the time).  The benefit of this model is that when you want to save the object state all you have to do is serialize the object out to a file.  Done.  An XML file is saved and available for later.
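
    A minimal sketch of that round trip, assuming a hypothetical ServerConfig class.  It serializes to a string to stay self-contained; swapping the StringWriter for a StreamWriter would write to a file instead.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical object model. XmlSerializer needs a public type with a
// public parameterless constructor and public read/write members.
public class ServerConfig
{
    [XmlAttribute("name")]
    public string Name { get; set; }

    public string Server { get; set; }

    public int Port { get; set; }
}

public static class ConfigStore
{
    private static readonly XmlSerializer Serializer = new XmlSerializer(typeof(ServerConfig));

    // Object -> XML text.
    public static string Save(ServerConfig config)
    {
        using (var writer = new StringWriter())
        {
            Serializer.Serialize(writer, config);
            return writer.ToString();
        }
    }

    // XML text -> object.
    public static ServerConfig Load(string xml)
    {
        using (var reader = new StringReader(xml))
        {
            return (ServerConfig)Serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        var original = new ServerConfig { Name = "web01", Server = "IIS8.0", Port = 80 };

        string xml = Save(original);
        Console.WriteLine(xml);

        ServerConfig copy = Load(xml);
        Console.WriteLine(copy.Name + " " + copy.Server + " " + copy.Port);
    }
}
```

    The node names come from the class itself, so renaming a property changes both the reading and the writing side in one place.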

    Third, with the object model in hand you can quickly create a schema.  Again, with this schema you can enforce validation at load time and use the schema to get XML IntelliSense when hand-editing XML files.
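
    One way to derive a schema from the object model in code is sketched below, using the schema exporter that lives alongside XmlSerializer.  The ServerConfig class is again hypothetical, and this is only one possible route; the xsd.exe tool that ships with the .NET Framework SDK can generate a schema from a compiled type as well.

```csharp
using System;
using System.IO;
using System.Xml.Schema;
using System.Xml.Serialization;

// Hypothetical object model, invented for this example.
public class ServerConfig
{
    public string Server { get; set; }
    public int Port { get; set; }
}

public static class SchemaGenerator
{
    // Builds the XSD text that describes how XmlSerializer maps the given type.
    public static string ExportSchema(Type type)
    {
        var schemas = new XmlSchemas();
        var exporter = new XmlSchemaExporter(schemas);

        // Reflect over the type to get the same mapping the serializer would use.
        XmlTypeMapping mapping = new XmlReflectionImporter().ImportTypeMapping(type);
        exporter.ExportTypeMapping(mapping);

        using (var writer = new StringWriter())
        {
            foreach (XmlSchema schema in schemas)
            {
                schema.Write(writer);
            }
            return writer.ToString();
        }
    }

    public static void Main()
    {
        Console.WriteLine(ExportSchema(typeof(ServerConfig)));
    }
}
```

    The generated schema can then be saved as an .xsd file for load-time validation and for the editor.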

    Hopefully, by planning your XML/object model with these tips in hand, you can save yourself a lot of trouble in maintenance and make feature design easier.

    Figure 2: Do This
