XML Source - Making things easier with XSLT

XML Source - Making things easier with XSLT

Rate This
  • Comments 12

As I wrote in my previous post, complex XML documents will produce multiple outputs when you're using the XML Source adapter. Most of the time it will be easier to pre-process your source file with XSLT to de-normalize it a bit. Reducing the number of outputs greatly simplifies your data flow.

Let's take the same XML document I used in the last example:

<extract date="2007-12-05">
    <counters>
        <counter category="dispatcher" name="server1">
            <runtime>6</runtime>
            <queue>3</queue>
            <maxrequest>8</maxrequest>
            <color>blue</color>
            <host>
                <name>svo2555</name>
                <path>\\dispatcher</path>
                <lastaccessed>2007-02-03</lastaccessed>
            </host>
        </counter>
        <counter category="gateway" name="server1">
            <runtime>1</runtime>
            <queue>10</queue>
            <maxrequest>10</maxrequest>
            <color>purple</color>
            <host>
                <name>svo2555</name>
                <path>\\gateway</path>
                <lastaccessed>2007-02-03</lastaccessed>
            </host>
        </counter>
    </counters>
</extract>

We want to flatten this out a bit using an XSL transform like this one (forgive my novice XSLT skills):

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="/extract">
    <xsl:variable name="extractDate" select="/extract/@date" />
    <counters>
    <xsl:for-each select="counters/counter">
        <counter>
            <extractDate><xsl:value-of select="$extractDate"/></extractDate>
            <category><xsl:value-of select="@category"/></category>
            <name><xsl:value-of select="@name"/></name>
            <runtime><xsl:value-of select="runtime"/></runtime>
            <queue><xsl:value-of select="queue"/></queue>
            <maxrequest><xsl:value-of select="maxrequest"/></maxrequest>
            <color><xsl:value-of select="color"/></color>
            <hostName><xsl:value-of select="host/name"/></hostName>
            <path><xsl:value-of select="host/path"/></path>
            <lastaccessed><xsl:value-of select="host/lastaccessed"/></lastaccessed>
        </counter>
    </xsl:for-each>
    </counters>
</xsl:template>
</xsl:stylesheet> 

We'll apply the transform with an XML Task. Add one to your package, and open the editor. You'll want to change the Operation Type property to XSLT, set SaveOperationResult to true, and set all of the file connections.

image

Note, the Source should be your XML source document and the SecondOperand is your XSLT document.

The processed XML looks like this:

<?xml version="1.0" encoding="utf-8"?>
<counters>
  <counter>
    <extractDate>2007-12-05</extractDate>
    <category>dispatcher</category>
    <name>server1</name>
    <runtime>6</runtime>
    <queue>3</queue>
    <maxrequest>8</maxrequest>
    <color>blue</color>
    <hostName>svo2555</hostName>
    <path>\\dispatcher</path>
    <lastaccessed>2007-02-03</lastaccessed>
  </counter>
  <counter>
    <extractDate>2007-12-05</extractDate>
    <category>gateway</category>
    <name>server1</name>
    <runtime>1</runtime>
    <queue>10</queue>
    <maxrequest>10</maxrequest>
    <color>purple</color>
    <hostName>svo2555</hostName>
    <path>\\gateway</path>
    <lastaccessed>2007-02-03</lastaccessed>
  </counter>
</counters>

Add a Data Flow Task, and setup your XML source to use the processed XML document. You'll need to update/regenerate the schema for your document to account for the new format. Notice there is now only one output to deal with.

Leave a Comment
  • Please add 2 and 6 and type the answer here:
  • Post
  • Thanks for the tip. I've recently been on a project where we needed to do this exact same thing but we hit a slight problem. We noticed that if the original XML file is larger than 100MB it would cause a System.OutOfMemoryException when it tries to transform the XML. Have you heard of this occurring?

  • Hi Tanner,

    I believe both the XML Task and XML Source read the entire XML document into memory before performing any operations. They were originally designed to work on smaller XML files.

    I'd suggest trying a similar operation using a script task, and see if that works better. You'll have more control over how the XSL is executed.

    ~Matt

  • Are there any plans to getting this upgraded to XSLT 2.0 and making it more robust?

    Mark

  • Hi Mark,

    (Wow, sorry for the delay! I completely missed this comment).

    Yes, improving the performance of our XSLT usage is one of our work items for the next release. Hopefully they'll make it in!

    ~Matt

  • Thanks for the article!

    Mohammad

  • Thanks for the article

    Few questions:

    1. Is it required to create the XSLT file, i just have an xml file how can create this xslt file.

    2. what is second operand?

    Regards,

    Eshwar.

  • Hi Matt,

    Is there a better way to create the xslt file and not manually?

    Thanks

  • Matt,

    I am trying to do the same thing, yanking data that's at the root level, but my file is also much simpler because I don't have sub element data. Can anyone here please help me generate the xslt file that I need to generate the final XML source file. Thanks

    <ResponseFile Date="2011-11-11" StatusCode="s"  MessageText="FAIL">

            <Response TranCode="15" ID="4444" BorrowerID="1101" Status="5"/>

            <Response TranCode="15" ID="4444" BorrowerID="7777" Status="5"/>

            <Response TranCode="15" ID="4444" BorrowerID="8888" Status="5"/>

       </ResponseFile>

  • I tried this one but it says

    "[XML Task] Error: An error occurred with the following error message: "There are multiple root elements. Line 482, position 2.".

  • Hi Matt

    In a case where i have multiple files, say 500 XML files. how would i convert them, because i have tried a For Each Loop and it's failing and im not getting any errors, somehow it just manage to process 1 file

  • Hi Matt,

    Can you please let me know how to proceed further to get the data from processesed xml to sql table as I am getting an error in doing so? It would be a great help. Really strugling to get it resolve. I am new to SSIS.

  • Where the generated xml files get stored by default

Page 1 of 1 (12 items)