Welcome to MSDN Blogs Sign in | Join | Help

Using XML Source

Providing a generic way to extract data from XML documents of any format is pretty complicated. The SSIS XML Source Adapter tries to do just that, to various degrees of success.

XML Source is easiest to use when your input has a simple element/sub-element structure:

<rootgoo>
    <goo>
        <subgoo>value</subgoo>
        <moregoo>1</moregoo>
    </goo>
    <goo>
        <subgoo>value</subgoo>
        <moregoo>2</moregoo>
    </goo>
</rootgoo>

Or when the values are listed as attributes (such as the output you get from SELECT ... FOR XML RAW statements in SQL Server).

<root>
   <row CustomerID="1" TerritoryID="1" AccountNumber="AW00000001" />
   <row CustomerID="2" TerritoryID="1" AccountNumber="AW00000002" />
</root>

Both of these XML documents would produce a single output in the XML Source. When your XML data starts to get more complicated (multiple levels of elements / attributes), you start getting more outputs. These outputs are linked with IDs, which you will have to join yourself with a merge join transform.

Dealing with multiple outputs

In this completely made up example. the XML is a little more complex.

<extract date="2007-12-05">
    <counters>
        <counter category="dispatcher" name="server1">
            <runtime>6</runtime>
            <queue>3</queue>
            <maxrequest>8</maxrequest>
            <color>blue</color>
            <host>
                <name>svo2555</name>
                <path>\\dispatcher</path>
                <lastaccessed>2007-02-03</lastaccessed>
            </host>
        </counter>
        <counter category="gateway" name="server1">
            <runtime>1</runtime>
            <queue>10</queue>
            <maxrequest>10</maxrequest>
            <color>purple</color>
            <host>
                <name>svo2555</name>
                <path>\\gateway</path>
                <lastaccessed>2007-02-03</lastaccessed>
            </host>
        </counter>
    </counters>
</extract>

Let's see how XML Source treats this.

xmlsrc

As you see, it created three outputs - counters, counter, and host - and created _id columns to identify the rows. Our destination table wants all of this data in a single row, which means we'll have to merge.

image

First we'll mark that the data has been sorted already, using the generated _id columns as the Sort Key. This will let us merge without having to insert a Sort transform.

- Right click on the XML Source, and bring up the Advanced Editor

- Select the host output, and set the IsSorted property to True.

image

- Expand the host output, and then expand Output Columns. Select the _id field, and set the SortKeyPosition property to 1

image

Follow the same steps for the counter output. Make sure you set the SortKeyPosition value on the "counter_id" column, and not the "counters_id" column - we're going to ignore the counters output all together.

Add a Merge Join transform. Connect the host and counter outputs from the XML Source to it.

image

You can then open the merge join transform, and take the columns from both inputs.

image

Voila!

Published Tuesday, December 11, 2007 9:16 PM by mmasson
Filed under:

Comments

# XML Source - Making things easier with XSLT

As I wrote in my previous post , complex XML documents will produce multiple outputs when you're using

Saturday, December 15, 2007 11:36 AM by SSIS Stuff

# re: Using XML Source

How it can done using vb macro? can u explain

Tuesday, February 05, 2008 9:37 AM by SenthilS

# re: Using XML Source

Nice. I've never read this before. Shows how valuable a bit of out-of-the-box thinking can be.

-Jamie

Monday, March 03, 2008 4:32 PM by jamiet

# re: Using XML Source

hello there,

Just as your example had 3 outputs, one of the XML I am working on produced 5 outputs. So I tried using multiple "merge and joins". The outputs were merged fine, but when I try to give the outputs of the resultant merge and joins to the next one, en error shows up saying that

"The IsSorted property must be set to true on both sources of this transformation"

How do I set that on a merge and join transform ??

Friday, May 02, 2008 8:02 PM by kalcee

# re: Using XML Source

Hi Kalcee,

Looks like you might have missed a step. Take a look at the step that starts with "Right click on the XML Source, and bring up the Advanced Editor" ... you can force the IsSorted property to true on the "Input and Output Properties" page.

~Matt

Monday, May 05, 2008 7:05 PM by mmasson

# re: Using XML Source

Hi,

Great information, I can remove most of my manual sorts!  It will clear my task up a lot!

Quick question though, in your XML you have your main tag with a 'date' attribute, taking a look at your list of record sets you don't have a extract set.  I've run into the same problem as the parent tag has values in which I need.  Do you know how to access this data?

In my case it's for the report date, I need to know for which date the data in the XML is for, unfortunately the date is held within the main tag like in your example.

<extract date="2007-12-05">

   <counters>

The date is not accessible according to my SSIS

Cheers

Ian

Wednesday, May 21, 2008 8:12 AM by Telly

# re: Using XML Source

Hi Ian,

I noticed this too - it looks like our XML Source doesn't pick up attribute values from the root element.

Your best bet is to massage the data bit with XSLT before processing. See my other post about this here:

http://blogs.msdn.com/mattm/archive/2007/12/15/xml-source-making-things-easier-with-xslt.aspx

~Matt

Friday, May 23, 2008 5:51 PM by mmasson

# re: Using XML Source

As a side note to using XML as a source, there is an XML destination adapter for SSIS available at http://www.keelio.com

Sunday, June 15, 2008 11:19 PM by KeelioSoftware

# XML source and multiple data outputs for elements.

XML source and multiple data outputs for elements.

Saturday, October 25, 2008 11:54 AM by Gully Coder

# re: Using XML Source

Excellent! Looked all over to find a good and simple explanation of how to get the XML files into SQL Server

Saturday, October 25, 2008 10:02 PM by Arild - IMMIX

# re: Using XML Source

Hello,

I have a single XML file and I use your technique (great post by the way..).

But my problem is that I have more that two outputs coming from my XML Source Adapter.

How could I deal with this without having to "cascade" the merge join ?

Any help is welcome.

Thanks.

Wednesday, November 12, 2008 12:21 PM by bre

# re: Using XML Source

what if I don't want to merge all the data in one table, what if I want to have one to many tables with a FK key?

I did get the keys generated from SSIS but they don't seem work give right relationship.

for example I got a table called events which hold the callid and eventid and I was hoping it would have the relationship there, but nope.

the ids don't match in both tables, in fact I can't even find the eventid that is linked to the callid...

at this point I got my data mapped to my SQL tables fine, but with out a relationship. I can't tell how many events are for one call.

I am about to re do everything using C# and xml objects to import the data. it's hard to believe that this could not be done in SSIS.

please help.

Here is the example of the xml I am working with

- <calls>

- <call id="@BR8" callkey="1" date="09062008" acd="0" starttime="2118" endtime="2181" type="1" result="1" from="" to="x82185 CUSTOMS" dnis="2623" ansby="x82185 CUSTOMS" requeues="0" intf="0" overflowed="0" transfd="0" held="0" confd="0" timetoabn="0" timetoans="6" tottalktm="57" totconftm="0" totholdtm="0">

- <events>

- <event starttime="2118" callkey="1" endtime="2118" type="1" ani="" dnis="" digits="82185" cause="0" montd="0" rqud="0" intf="0" overflowed="0">

 <trnk key="default" id="C04B20" name="" nick="" />

 </event>

- <event starttime="2118" callkey="1" endtime="2124" type="2" ani="" dnis="2623" digits="82185" cause="22" montd="0" rqud="0" intf="0" overflowed="0">

 <ext key="slap" id="82185" name="CUSTOMS" nick="" vport="0" autoattend="0" qcc="0" />

 <ext key="default" id="82185" name="CUSTOMS" nick="" vport="0" autoattend="0" qcc="0" />

 <trnk key="from" id="C04B20" name="" nick="" />

 </event>

- <event starttime="2124" callkey="1" endtime="2181" type="3" ani="" dnis="2623" digits="" cause="0" montd="0" rqud="0" intf="0" overflowed="0">

 <ext key="slap" id="82185" name="CUSTOMS" nick="" vport="0" autoattend="0" qcc="0" />

 <ext key="default" id="82185" name="CUSTOMS" nick="" vport="0" autoattend="0" qcc="0" />

 </event>

 <event starttime="2181" callkey="1" endtime="2182" type="22" ani="" dnis="" digits="" cause="0" montd="0" rqud="0" intf="0" overflowed="0" />

 </events>

 </call>

Thursday, December 18, 2008 10:32 PM by rperetz
Anonymous comments are disabled
 
Page view tracker