VB XML Cookbook, Recipe 6: Writing an XSLT Transform in VB (Doug Rothaus)

Most XSLT programmers are familiar with this XSLT transform to copy an XML file.

<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
    <xsl:output method="xml" indent="yes"/>

    <xsl:template match="@* | node()">
        <xsl:copy>
            <xsl:apply-templates select="@* | node()"/>
        </xsl:copy>
    </xsl:template>
</xsl:stylesheet>

This stylesheet is the standard identity transform: it copies an entire XML document while “touching” every node and attribute. If you add a template with a more specific match, only the nodes or attributes that match it are transformed in place. Everything else is simply copied.

We can do the same thing in Visual Basic with XML Literals (along with LINQ to XML and XML Axis Properties). Our VB code will “touch” each node and attribute by recursively navigating the XML document, following this pseudo-code:

Starting with the root element, perform the following whenever you encounter a node:

    If the node is an element
        If the element has attributes, transform or copy each attribute
        If the element has child nodes, transform or copy each node
    If the node is text, transform or copy the text
    If the node is CData, transform or copy the CData
    If the node is a comment, transform or copy the comment
    If the node is a processing instruction, transform or copy the processing instruction

 

For this cookbook entry, we’ll create an abstract (MustInherit) base class that performs this pseudo-coded recursive navigation of an XML document. We can then create a class that inherits from that base class to perform specific transforms. First, we’ll create the abstract class and the “starting point,” a function called Transform that takes the XML document (XDocument) to be transformed as input and returns the transformed document.

 

Imports System.Xml.Linq

Public MustInherit Class VBXmlTransform

  Public Overridable Function Transform(ByVal xmlDoc As XDocument) As XDocument
    Return <?xml version="1.0" encoding="utf-8"?>
           <%= ProcessElement(xmlDoc.Root) %>
  End Function

End Class

 

Next, we add the logic that is called for each XML node (XNode) encountered. This includes elements, text, CData, and so on. Our code needs to determine the type of XML node and call the related function to transform or copy the node type and return the result, which is either a copied or transformed node. This method is called ProcessNode and is shown here.

 

  Public Overridable Function ProcessNode(ByVal xmlNode As XNode) As XNode
    ' This method ignores DTD (XDocumentType) content.

    Dim nodeType = xmlNode.GetType()

    ' Because XCData inherits from XText, check for the XCData type before checking
    ' for XText.
    If nodeType Is GetType(XCData) Then Return ProcessCData(xmlNode)
    If nodeType Is GetType(XText) Then Return ProcessText(xmlNode)
    If nodeType Is GetType(XElement) Then Return ProcessElement(xmlNode)
    If nodeType Is GetType(XComment) Then Return ProcessComment(xmlNode)
    If nodeType Is GetType(XProcessingInstruction) Then Return _
      ProcessProcessingInstruction(xmlNode)

    Return xmlNode
  End Function

 

Next, we add the strongly-typed functions that process each node type, plus one for attributes. The function that processes an element is a special case, so we’ll cover it separately below. The others are simple: because the default behavior of the base class is to copy the document, each function just returns its input. Their purpose is to give us strongly-typed functions that an inheriting class can override with specific behavior. Here they are (without the ProcessElement function).

 

  Public Overridable Function ProcessAttribute(ByVal xmlAttribute As XAttribute) As XAttribute
    Return xmlAttribute
  End Function

  Public Overridable Function ProcessCData(ByVal xmlCData As XCData) As XCData
    Return xmlCData
  End Function

  Public Overridable Function ProcessText(ByVal xmlText As XText) As XText
    Return xmlText
  End Function

  Protected Overridable Function ProcessComment(ByVal xmlComment As XComment) As XComment
    Return xmlComment
  End Function

  Public Overridable Function ProcessProcessingInstruction( _
    ByVal pi As XProcessingInstruction) As XProcessingInstruction

    Return pi
  End Function

 

Now let’s look at the ProcessElement function. Elements are a special case because they can have both attributes and child nodes, which need to be transformed or copied as well. So we must call the ProcessAttribute function for each attribute and the ProcessNode function for each child node. We’ll encapsulate this code in a function called CopyElement. The ProcessElement function looks like the other strongly-typed functions, except that it returns the result of CopyElement instead of the input element. The CopyElement function uses XML Literals, embedded expressions, and LINQ to XML to build the copy of the XML element, as shown here.

 

  Public Overridable Function ProcessElement(ByVal xmlElement As XElement) As XElement
    Return CopyElement(xmlElement)
  End Function

  Public Overridable Function CopyElement(ByVal xmlElement As XElement) As XElement
    Return <<%= xmlElement.Name %>
             <%= From attribute In xmlElement.Attributes() _
                 Select ProcessAttribute(attribute) %>>
             <%= From node In xmlElement.Nodes() _
                 Select ProcessNode(node) %>
           </>
  End Function

 

That’s the extent of our abstract class. Now we can make use of it for very simple or very complex identity transforms. Let’s see an example.
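Before writing a real transform, it can be worth confirming that the base class copies a document faithfully when nothing is overridden. The following is a minimal sketch, not part of the original recipe: IdentityTransform and IdentityCheck are hypothetical names, the sample document is made up, and the code assumes the VBXmlTransform class above is in the same project.

```vb
Imports System.Xml.Linq

' Hypothetical concrete class: VBXmlTransform is MustInherit, so even a pure
' copy needs a subclass, here one that overrides nothing.
Public Class IdentityTransform
    Inherits VBXmlTransform
End Class

Module IdentityCheck
    Sub Main()
        ' Small in-memory sample document (made up for this check).
        Dim source = <?xml version="1.0" encoding="utf-8"?>
                     <root id="1">
                         <!-- a comment -->
                         <child><![CDATA[raw <text>]]></child>
                     </root>

        Dim copy = New IdentityTransform().Transform(source)

        ' Is the copied root the same object as the source root?
        Console.WriteLine(copy.Root Is source.Root)

        ' Do the two trees match node for node?
        Console.WriteLine(XNode.DeepEquals(source.Root, copy.Root))
    End Sub
End Module
```

If the base class behaves as described, the first line printed should be False (CopyElement builds a new element) and the second should be True (the new tree has the same names, attributes, and content as the source).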

Creating a Transform

 

Our example uses the same source XML file that was posted with Recipe 1. It has mixed content from several namespaces, which makes it a nice test sample. The mixed content is found in the <AdditionalContactInfo> element, defined in the http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ContactInfo schema. The additional contact info can include address information under three different element names: homePostalAddress, physicalDeliveryOfficeName, and registeredAddress. The addressType type has a required element named PostalCode. We can create a simple class that transforms the <PostalCode> element by renaming it to <ZipCode>.

 

First, we need to import the XML namespaces used in our source document. The abstract VBXmlTransform class does not need to know about these namespaces, which are specific to the source document, but our inheriting class does.

 

Imports <xmlns="http://SampleSchema/AWContacts">
Imports <xmlns:aci="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ContactInfo">
Imports <xmlns:act="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ContactTypes">
Imports <xmlns:crm="http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ContactRecord">

 

Next, we create a class that inherits from the VBXmlTransform class. In this class, called AWTransform, we override methods of the base class to perform whatever transform we want. In this case, we’ll override ProcessElement, because we’re looking for any element named PostalCode. If we find that element, we transform it. If not, we defer to the ProcessElement method of the base class.

 

Class AWTransform
    Inherits VBXmlTransform

    ' Rename <act:PostalCode> to <ZipCode>.
    '
    ' Create an XName object to use for comparisons. This will perform better than comparing
    ' xmlElement.Name.LocalName to a string.

    Private postalCodeXName As XName = _
      XName.Get("PostalCode", _
                "http://schemas.microsoft.com/sqlserver/2004/07/adventure-works/ContactTypes")

    Public Overrides Function ProcessElement(ByVal xmlElement As XElement) As XElement
        ' Every branch returns, so no trailing Return is needed.
        Select Case xmlElement.Name
            Case postalCodeXName
                Return TransformPostalCode(xmlElement)
            Case Else
                Return MyBase.ProcessElement(xmlElement)
        End Select
    End Function

    Public Function TransformPostalCode(ByVal postalCodeElement As XElement) As XElement
        Return <ZipCode><%= postalCodeElement.Value %></ZipCode>
    End Function
End Class

 

Transforming the Document

 

To call this transform, we create an instance of our inheriting class, AWTransform, and then pass the source XML document to the Transform method as shown here:

 

        Dim xmlPath = My.Application.Info.DirectoryPath & "\..\..\AWContacts.xml"
        Dim savePath = My.Application.Info.DirectoryPath & "\..\..\TransformSave.xml"

        Dim xmlDoc = XDocument.Load(xmlPath)

        Dim transform As New AWTransform()

        Dim transformedDoc = transform.Transform(xmlDoc)
        transformedDoc.Save(savePath)
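
One quick way to confirm that the rename took effect is to count elements in the result with XML Axis Properties. This is only a sketch under stated assumptions: it is meant to follow the Transform call above, in a file that carries the same Imports <xmlns=...> and Imports <xmlns:act=...> statements as AWTransform (so that <ZipCode> resolves to the default namespace and act:PostalCode to the ContactTypes namespace). The variable names remaining and renamed are illustrative.

```vb
' Count what is left of the old element name and how many new ones exist.
' The descendant axis (...) searches the whole transformed document.
Dim remaining = transformedDoc...<act:PostalCode>.Count()
Dim renamed = transformedDoc...<ZipCode>.Count()

Console.WriteLine("PostalCode elements remaining: " & remaining)
Console.WriteLine("ZipCode elements created: " & renamed)
```

If every PostalCode element was matched by the override, the first count should be zero and the second should equal the number of PostalCode elements in the source document.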

Other Examples

 

Let’s look at some other examples we could add to our AWTransform class.

 

The following code example shows how we can transform the content of an existing attribute. If the transform finds an attribute named date, it transforms the date value into the general date and time format.

 

    Private dateXName As XName = XName.Get("date")

    Public Overrides Function ProcessAttribute(ByVal xmlAttribute As XAttribute) As XAttribute
        If xmlAttribute.Name.Equals(dateXName) Then Return TransformDateAttribute(xmlAttribute)

        Return MyBase.ProcessAttribute(xmlAttribute)
    End Function

    Public Function TransformDateAttribute(ByVal dateAttribute As XAttribute) As XAttribute
        Dim dateValue As DateTime

        ' Return a new attribute so that the source document is not modified in place.
        If DateTime.TryParse(dateAttribute.Value, dateValue) Then _
            Return New XAttribute(dateAttribute.Name, dateValue.ToString("G"))

        ' The value is not a recognizable date, so copy the attribute unchanged.
        Return dateAttribute
    End Function

 

The following code example shows how we can remove data from the transformed document. If the transform finds a CData section, it returns Nothing so that the CData section is not included in the resulting document.

 

    Public Overrides Function ProcessCData(ByVal xmlCData As XCData) As XCData
        Return Nothing
    End Function

 

 
