Another issues that came up many times on the discussion list : propagating identity values from parent to child when using SQLXMLBulkload. This is a feature that was introduced with SQLXML3 SP2 release and it's also available in SQL Server 2005.
Basically, the way it works is that bulkload identifies the primary key/foreign key relationship described in the schema file.Then, it will first insert the records in the table with the primary key and propagate the identity values generated by the SQL Server to the tables with the foreign key columns.
Here is a demo on how this works (this is an example from the sqlserver.xml newsgroup):
Consider the following database schema:
Pregnancies
------------
PregnancyID
FirstName
LastName
DateOfBirth
PregnancyFacts
----------------
PregnancyFactID
PregnancyID (FK)
FactDate
FactName
FactValue
CREATE TABLE [dbo].[Pregnancies](
[FirstName] [nvarchar](1000) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[LastName] [nvarchar](1000) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[DateOfBirth] [datetime] NULL,
[PregnancyID] [int] IDENTITY(1,1) NOT NULL
) ON [PRIMARY]
CREATE TABLE [dbo].[PregnancyFacts](
[PregnancyID] [int] NOT NULL,
[FactDate] [datetime] NULL,
[FactName] [nvarchar](1000) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[FactValue] [nvarchar](1000) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,
[FactID] [int] IDENTITY(1,1) NOT NULL
) ON [PRIMARY]
The user has the following xml data :
<pregnancies>
<pregnancy>
<summary>
<firstName>Jane</firstName>
<lastName>Doe</lastName>
<dob>7/22/85</dob>
</summary>
<facts>
<fact>
<factDate>3/3/07</factDate>
<factName>Eye Color</factName>
<factValue>Brown</factValue>
</fact>
<fact>
<factDate>6/6/07</factDate>
<factName>Hair Color</factName>
<factValue>Brown</factValue>
</fact>
</facts>
</pregnancy>
<pregnancy>
<summary>
<firstName>Mary</firstName>
<lastName>Smith</lastName>
<dob>6/12/85</dob>
</summary>
<facts>
<fact>
<factDate>3/3/07</factDate>
<factName>Eye Color</factName>
<factValue>Blue</factValue>
</fact>
<fact>
<factDate>6/6/07</factDate>
<factName>Hair Color</factName>
<factValue>Blonde</factValue>
</fact>
</facts>
</pregnancy>
</pregnancies>
Here is the annotated XSD schema:
<?xml version="1.0" encoding="utf-8" ?>
<xs:schema xmlns:sql="urn:schemas-microsoft-com:mapping-schema"
xmlns="http://tempuri.org/XMLSchema.xsd"
xmlns:xs="http://www.w3.org/2001/XMLSchema">
<xs:annotation>
<xs:appinfo>
<sql:relationship name="PPF"
parent="Pregnancies"
child="PregnancyFacts"
parent-key="PregnancyID"
child-key="PregnancyID"/>
</xs:appinfo>
</xs:annotation>
<xs:element name="pregnancies" sql:is-constant="true">
<xs:complexType>
<xs:sequence>
<xs:element name="pregnancy" sql:relation="Pregnancies">
<xs:complexType>
<xs:sequence>
<xs:element name="summary" sql:is-constant="true">
<xs:complexType>
<xs:sequence>
<xs:element name="firstName" sql:field="FirstName" type="xs:string"/>
<xs:element name="lastName" sql:field="LastName" type="xs:string"/>
<xs:element name="dob" sql:field="DateOfBirth" type="xs:date" sql:datatype="datetime"/>
</xs:sequence>
</xs:complexType>
</xs:element>
<xs:element name="facts" sql:is-constant="true">
<xs:complexType>
<xs:sequence>
<xs:element name="fact" sql:relation="PregnancyFacts" sql:relationship="PPF">
<xs:complexType>
<xs:sequence>
<xs:element name="factDate" sql:field="FactDate" type="xs:date" sql:datatype="datetime"/>
<xs:element name="factName" sql:field="FactName" type="xs:string"/>
<xs:element name="factValue" sql:field="FactValue" type="xs:string"/>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
I use a simple script to run the bulkload:
set objBL = CreateObject("SQLXMLBulkLoad.SQLXMLBulkLoad.4.0")
objBL.ConnectionString = "provider=SQLOLEDB;data source=localhost;database=tempdb;integrated security=SSPI"
objBL.KeepIdentity = false
objBL.ErrorLogFile = "error.xml"
objBL.Execute "schema.xml","data.xml"
set objBL=Nothing
Here is the data that got inserted into the tables:
FirstName LastName DateOfBirth PregnancyID
Jane Doe 1985-07-22 00:00:00.000 1
Mary Smith 1985-06-12 00:00:00.000 2
(2 row(s) affected)
PregnancyID FactDate FactName FactValue FactID
1 2007-03-03 00:00:00.000 Eye Color Brown 1
1 2007-06-06 00:00:00.000 Hair Color Brown 2
2 2007-03-03 00:00:00.000 Eye Color Blue 3
2 2007-06-06 00:00:00.000 Hair Color Blonde 4
(4 row(s) affected)
As you see from the above example, the PregnancyID column values were propagated from the parent table Pregnancies to the child table PregnancyFacts.
I hope this info will shed some light to those who are trying to use this feature.
This question came up many times in the xml newsgroup and forum so I thought I could provide a simple solution to it.
Basically, the user has an XSD schema file and wants to create tables in a database that would correspond to the schema definition.In order to accomplish this, the user needs to annotate the schema file using the SQLXML annotations (see http://msdn2.microsoft.com/en-us/library/ms172649(SQL.90).aspx . By default, complexType elements map to tables and attributes and simpleType elements map to columns).
Bulkload's SchemaGen functionality allows the user to create and drop tables via an API setting.If SchemaGen property is set to TRUE, the tables identified in the schema will be created (the database must exist).If SGDropTables property is also set to TRUE, the tables will be deleted (if previously exist in the database) before they are re-created.
If no data needs to be uploaded (only tables generated), the Bulkload property should be set to FALSE.
Below is a small example on how this works.The data file is empty.
set objBL = CreateObject("SQLXMLBulkLoad.SQLXMLBulkLoad.4.0")
objBL.ErrorLogFile = "error.xml"
objBL.ConnectionString = "provider=sqloledb;server=myserver;database=tempdb;Integrated Security=SSPI"
objBL.SchemaGen = true
objBL.SGDropTables = true
objBL.Bulkload = false
objBL.Execute "schema.xml","data.xml"
set objBL=Nothing
Here is the schema file content:
<?xml version="1.0" ?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:sql="urn:schemas-microsoft-com:mapping-schema">
<xs:element name="Product" sql:relation="ProductDescription">
<xs:complexType>
<xs:sequence>
<xs:element name="ProductID" type="xs:unsignedInt" sql:field="ProductID" />
<xs:element name="ProductName" type="xs:string" sql:field="ProductName" />
<xs:element name="Description" type="xs:string" sql:field="DescriptionPhraseID" />
</xs:sequence>
</xs:complexType>
</xs:element>
</xs:schema>
The table that was created in tempdb database:
CREATE
TABLE [dbo].[ProductDescription]([ProductID] [int] NULL,
[ProductName] [nvarchar](1000) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
[DescriptionPhraseID] [nvarchar]
(1000) COLLATE SQL_Latin1_General_CP1_CI_AS NULL
)
ON [PRIMARY]
One thing to note here is that SchemaGen does not use XSD schema facets and extensions to generate the relational SQL Server schema.It only provides basic functionality and the user should modify the generated tables manually, if needed.
Summary:
Learn how to validate the XML data against the annotated XSD schema when using SQLXML Bulkload, with the help of System.Xml classes.
Introduction:
SQLXML Bulkload functionality available in Microsoft SQL Server 2005 allows the user to shred large amounts of xml data into the database using an annotated xsd schema. There are some known limitations regarding the XML data consumptions and this article will try to address one of them and will offer a solution to overcome it.
This article covers the following:
· Validating XML data with an XSD schema
· Using Bulkload in a .NET application
The problem:
In general, all element and attribute data which is contained in the body of an XML document is consumed and mapped to corresponding tables and columns and stored in the database.
However, one of the limitations of Bulkload is that the data validation is not performed during the bulk loading process. Here are some facts about Bulkload that need to be understood:
1. Bulkload doesn’t ensure that the XML data is valid with respect to any DTD that is defined or referenced within the xml data. It will only check if it is well-formed. Any error of this kind will be reported and no data will be inserted.
2. Bulkload does not make any attempt to validate the data with respect to the supplied annotated xsd schema.
Considering the above statements, if there is any mismatch between the data file and the schema, the bulkload operation will result in no data being inserted in the database and no error will be reported. Thus, users found it very hard to identify what the problem is.
Usually this problem happens due to one of the following:
1. The schema file contains the correct mappings but the data file has a small difference (like different casing for element/attribute name).
Example:
Schema has the following:
<xsd:attribute name=”CustomerID” type=”xsd:string”/>
Data has the follwing:
<Customers Customerid=”ALFKI”/>
Notice the difference marked in red.
2. There are elements/attributes in the data that are not defined in the schema.
To overcome this problem, the user can validate the xml data against the annotated xsd schema before doing the bulk load operation. This can be done using the XmlReader and XmlReaderSettings classes from System.Xml namespace.
Here is an example on how to do this, using bulkload functionality in a .NET application:
using System;
using System.Collections.Generic;
using System.Text;
using System.Runtime.InteropServices;
using System.Xml;
using System.Xml.Schema;
using System.IO;
using SQLXMLBULKLOADLib;
namespace Test
{
class Program
{
public static bool bValid = true;
[STAThread]
static void Main(string[] args)
{
SQLXMLBulkLoad4Class objBL = new SQLXMLBULKLOADLib.SQLXMLBulkLoad4Class();
objBL.ConnectionString = "provider=sqloledb;server=server;database=database;integrated security=SSPI";
objBL.ErrorLogFile = "error.xml";
objBL.SchemaGen = true;
if (IsValid("data.xml", "schema.xml"))
{
try
{
objBL.Execute("schema.xml", "data.xml");
}
catch (Exception e)
{
Console.WriteLine(e.ToString());
}
}
else
{
Console.WriteLine("XML data is not valid according to the schema definition!");
}
}
public static void MyValidationCallBack(object sender, ValidationEventArgs args)
{
bValid = false;
Console.WriteLine("The following error occured : " + args.Message);
}
public static bool IsValid(string strXML, string strXSD)
{
//Validate the xml data against the xsd schema
XmlReader reader = null;
XmlReaderSettings settings = null;
try
{
settings = new XmlReaderSettings();
settings.Schemas.Add("", "schema.xml");
settings.ValidationEventHandler += new ValidationEventHandler(MyValidationCallBack);
settings.ValidationType = ValidationType.Schema;
settings.ValidationFlags = XmlSchemaValidationFlags.None;
settings.ConformanceLevel = ConformanceLevel.Document;
reader = XmlReader.Create("data.xml", settings);
while (reader.Read()) { }
}
finally
{
reader.Close();
}
return bValid;
}
}
}
Conclusion:
This article and the included code sample present a solution to a very common problem customers encounter when using SQLXML Bulkload feature. For more details on Bulkload functionality, please visit the following links:
SQL Server Bulk Load Object Model:
http://msdn.microsoft.com/library/en-us/sqlxml3/htm/bulkload_3g30.asp
Guidelines and Limitations of XML Bulk Load:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/sqlxml3/htm/bulkload_5l44.asp
Examples of Bulk Loading XML Documents:
http://msdn.microsoft.com/library/en-us/sqlxml3/htm/bulkload_6bos.asp
Applies to:
Microsoft® SQL Server™ 2005
Microsoft Visual Studio® .NET