SPal's WebLog

  • MSDN presentation on "Introducing XML in SQL Server 2005"

    First of all, I'd like to thank all those who took the time to attend my presentation today (Nov 8, 2005). I hope you found it useful. The launch of SQL Server 2005, Visual Studio 2005 and BizTalk Server 2006 is a matter of great pride and honor for us all. 

     

    The website with links to the XML schemas published by SQL Server is http://schemas.microsoft.com/sqlserver. Look at the one called SqlTypes for the XML schema types corresponding to the built-in SQL types. This was one of the questions during the presentation.

     

    The Transact-SQL script I presented is below. I hope you get a chance to run it! Play with it to discover the power of the native XML support. For more information about the XML support, check out the REFERENCES in the left pane.

     

    Enjoy! Let me know if you have questions or would like me to address any issues in the blog.

     

    Shankar

    Program Manager

    Microsoft SQL Server

     

    This posting is provided "AS IS" with no warranties, and confers no rights.

    Use of included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm

     

    ----------------------------------------------------------------------- 

    -- This T-SQL script illustrates the use of typed (i.e. XML schema-bound) XML data type on SQL Server 2005.

     

    ----------------------------------------------------

    --- Cleanup

    ----------------------------------------------------

    USE         master

    GO

     

    --- Drop the existing database

    if    EXISTS (SELECT * FROM sys.databases WHERE name = 'MSDN')

          DROP  DATABASE MSDN

    GO

     

    ---------------------------

    --- Initialization

    ---------------------------

    --- Create a new database called MSDN

    CREATE      DATABASE MSDN;

    GO

     

    USE MSDN;

    GO

     

    -------------------------------------------------

    ---- Drop the meeting notes XML schema collection

    IF    EXISTS (select name from sys.xml_schema_collections where  name='meetingNotesCollection')

        DROP  XML SCHEMA COLLECTION meetingNotesCollection;

    go

     

    CREATE XML SCHEMA COLLECTION meetingNotesCollection

    AS

    '<xsd:schema

          xmlns:xsd="http://www.w3.org/2001/XMLSchema"

          xmlns="http://www.microsoft.com/meetingNotes"

          targetNamespace="http://www.microsoft.com/meetingNotes"

          elementFormDefault="qualified">

      <xsd:element name="meetingNotes" type="meetingNotesType"/>

      <xsd:complexType name="meetingNotesType" mixed="true">

          <xsd:sequence>

            <xsd:element name="title" type="xsd:string"/>

            <xsd:element name="meetingDate" type="xsd:date"/>

            <xsd:element name="participants"

                      type="participantType" />

            <xsd:element name="notes" type="noteType" />

            <xsd:element name="actionItem" type="actionItemType"/>

          </xsd:sequence>

      </xsd:complexType>

     

      <xsd:complexType name="participantType" mixed="true">

          <xsd:sequence>

            <xsd:element name="participant" type="xsd:string"

                minOccurs="1" maxOccurs="unbounded" />

          </xsd:sequence>

      </xsd:complexType>

     

      <xsd:complexType name="noteType" mixed="true">

          <xsd:sequence>

            <xsd:element name="note" type="xsd:string"

                      minOccurs="0" maxOccurs="unbounded"/>

          </xsd:sequence>

      </xsd:complexType>

     

      <xsd:complexType name="actionItemType" mixed="true">

          <xsd:sequence>

            <xsd:element name="description" type="xsd:string"

                      minOccurs="0" maxOccurs="unbounded"/>

          </xsd:sequence>

      </xsd:complexType>

     

    </xsd:schema>';

    GO

     

    --- Create a table to store MeetingNotes

    CREATE TABLE meetingData (

          id           int PRIMARY KEY,

          meetingNotes XML (meetingNotesCollection))

    GO

     

    -------------------------------

    --- Insertion of data

    -------------------------------

    --- Insert a literal value for XML

    INSERT INTO meetingData VALUES (1,

    '<p:meetingNotes

           xmlns:p="http://www.microsoft.com/meetingNotes">

          <p:title>Discussion of XML presentation</p:title>

          <p:meetingDate>2005-10-03Z</p:meetingDate>

          <p:participants>

                <p:participant>Shankar</p:participant>

                <p:participant>Brandon</p:participant>

          </p:participants>

          <p:notes>

                <p:note>Equal amount of slides and demos</p:note>

                <p:note>VS-focused slides</p:note>

                <p:note>SQL/CLR project for XML processing</p:note>

          </p:notes>

          <p:actionItem>

                <p:description>Work on slides - Shankar</p:description>

                <p:description>Work on demos - Shankar</p:description>

          </p:actionItem>

    </p:meetingNotes>');

    GO

     

    --- Insert more data into the table

    INSERT INTO meetingData VALUES (2,

    '<meetingNotes xmlns="http://www.microsoft.com/meetingNotes"><title>Discussion of CDP and Office11</title><meetingDate>2005-09-29Z</meetingDate><participants><participant>Anilnori</participant><participant>Soumitras</participant>   <participant>shankarp</participant></participants><notes><note>Format of Word document and storage</note></notes><actionItem><description>Need to put all ideas into a document</description></actionItem></meetingNotes>');

     

    INSERT INTO meetingData VALUES (3,

    '<meetingNotes xmlns="http://www.microsoft.com/meetingNotes"><title>Discussion of TechReady presentation</title><meetingDate>2005-08-02Z</meetingDate><participants><participant>shankarp</participant><participant>heraino</participant></participants><notes> <note>Outline the features</note><note>Show architecture diagram</note></notes><actionItem><description>Update the presentation - shankarp</description></actionItem></meetingNotes>');

    GO

     

    --- Table contents

    SELECT  * FROM    meetingData;

    GO

     

    --- Index the XML column in the table

    CREATE      PRIMARY XML INDEX idx_xml ON meetingData(meetingNotes);

    GO

     

    ---- Create PATH secondary XML index

    CREATE      XML INDEX idx_xml_PATH

          ON meetingData(meetingNotes)

          USING XML INDEX idx_xml

          FOR PATH;

    GO

     

    --------------------------

    --- QUERYING XML DATA TYPE

    --------------------------

    --- Find the action items

    SELECT      meetingNotes.query ('

          declare default element namespace "http://www.microsoft.com/meetingNotes";

          /meetingNotes/actionItem/description')

    FROM  meetingData;

    GO

     

    --- Find the action items, one per row

    --- With on-the-fly element construction

    SELECT  dref.query ('<workitem>{data(.)}</workitem>')

    FROM    meetingData CROSS APPLY

                meetingNotes.nodes ('

                      declare default element namespace "http://www.microsoft.com/meetingNotes"; 

                      /meetingNotes/actionItem/description') R(dref);

    GO

     

    --- Find the action items, one per row

    --- With no element tags

    SELECT dref.value ('.', 'nvarchar(256)')

    FROM  meetingData CROSS APPLY

                meetingNotes.nodes ('

                      declare default element namespace "http://www.microsoft.com/meetingNotes"; 

                      /meetingNotes/actionItem/description') R(dref);

    GO

     

    --- Find meetings with at least 3 participants

    SELECT      *

    FROM  meetingData

    WHERE meetingNotes.exist ('

          declare default element namespace "http://www.microsoft.com/meetingNotes";

          /meetingNotes[count (//participant) > 2]') = 1;

    GO

     

    -------------------------------------------

    --- Promote meeting #participants property

    -------------------------------------------

    --- Create user-defined function for #participants

    CREATE      FUNCTION udf_get_pCount (

          @xData xml(dbo.meetingNotesCollection))

    RETURNS     int

    WITH SCHEMABINDING

    BEGIN

          RETURN (

          SELECT      @xData.value ('

                declare default element namespace "http://www.microsoft.com/meetingNotes";

                count(//participant)', 'int'))

    END

    GO

     

    ----- Add a computed column to the table for meetingDate

    ALTER TABLE meetingData

          ADD PCount AS dbo.udf_get_pCount(meetingNotes);

    GO

     

    ----- Check content of table

    SELECT * FROM meetingData;

    GO

     

    --- Index the computed column

    CREATE      NONCLUSTERED INDEX idx_pCount

          ON meetingData(pCount);

    GO

     

    -- Simpler alternative to the previous XQuery:

    SELECT *

    FROM   meetingData

    WHERE  PCount > 2;

    GO

     

    ------------------------------

    --- XML DATA TYPE MODIFICATION

    ------------------------------

    --- Add a new participant to a meeting

    UPDATE      meetingData

    SET         meetingNotes.modify ('

          declare default element namespace "http://www.microsoft.com/meetingNotes";

          insert      <participant>Jinghao</participant>

          into (/meetingNotes/participants)[1]')

    WHERE id = 1;

    GO

     

    --- Verify update

    SELECT  * FROM    meetingData;

    GO

     

    --- Modify the title of a meeting

    UPDATE      meetingData

    SET         meetingNotes.modify ('

          declare default element namespace "http://www.microsoft.com/meetingNotes";

          replace value of (/meetingNotes/title)[1]

          with "Discussion of XML presentation for MSDN"')

    WHERE id = 1;

    GO

     

    --- Verify update

    SELECT * FROM  meetingData

    GO

  • XQuery implementation in SQL Server 2005

    I made at presentation on September 1st on our XQuery implementation at the International Conference on Very Large Databases at Trondheim, Norway. The paper contains a lot of useful information, which, I hope, you will enjoy. The link is: http://www.vldb2005.org/program/paper/thu/p1175-pal.pdf.

    I'm adding this link to the left pane as well for future reference.

    I hope you like the content.

    Thank you,

    Shankar

  • Rebuild XML indexes when upgrading from Beta 2 to June CTP of SQL Server 2005

    Setup upgrade to June CTP of SQL Server 2005 is unavailable. Typically, you would uninstall the existing build of SQL Server 2005, install the June CTP, and attach your database files to the server.

     

    After Beta 2 of SQL Server 2005, we have made two changes that impact XML indexes. First, string comparisons on XML data type use a binary collation. Secondly, a bug in the key fields of the secondary XML indexes has been fixed. If you have created XML indexes using the Beta 2 build, you will find the XML indexes are disabled after the upgrade. If you have created the XML indexes using post Beta 2 builds, or already upgraded to a post Beta 2 build and rebuilt your XML indexes, you should not face this issue.

     

    To rebuild your XML indexes, use the following steps:

     

    1.       Check whether your XML indexes are disabled after the upgrade with the query below:

     

    SELECT  count(*)

    FROM    sys.xml_indexes

    WHERE   is_disabled != 0;

     

    If the query returns 0, then your XML indexes are not disabled and you do not need to do anything.

    2.       Otherwise, enable (rebuild) the primary and secondary XML indexes. The primary XML indexes must be rebuilt before the secondary XML indexes

    a.       Determine the primary XML indexes in a database:

    SELECT  sxi.name "Index name", so.name "Table name"

    FROM    sys.xml_indexes sxi JOIN sys.objects so

            ON (so.object_id = sxi.object_id)

    WHERE   using_xml_index_id IS NULL;

     

    b.       Rebuild the primary XML indexes:

    ALTER INDEX <primary_xml_index> ON <table_name>

    REBUILD;

     

    c.       Determine the secondary XML indexes in a database:

    SELECT  sxi.name "Index name", so.name "Table name"

    FROM    sys.xml_indexes sxi JOIN sys.objects so

            ON (so.object_id = sxi.object_id)

    WHERE   using_xml_index_id IS NOT NULL;

     

    d.       Then rebuild the secondary XML indexes:

    ALTER INDEX <secondary_xml_index> ON <table_name>

    REBUILD;

     

    A few additional notes:

     

    ·         The index options specified during the creation of these XML indexes will be preserved when you use ALTER INDEX … REBUILD.

    ·         Ana (tester) and Mark (developer) pointed out that a faster mechanism is to drop the XML indexes first and then to recreate them. This also requires less disk space, even compared to CREATE XML INDEX … WITH DROP_EXISTING. Do the following:

    a.       Make a note of the index options using

    SELECT * FROM sys.xml_indexes

     

    b.      Drop the primary XML indexes – this will drop the secondary XML indexes as well

    DROP INDEX <primary_xml_index> ON <table_name>

     

    c.       Recreate the primary and secondary XML indexes with the desired index options.

    ·         Using ALTER INDEX ALL ON <table_name> REBUILD could be used – it figures out the order in which the primary XML indexes must be rebuilt. However, it rebuilds all the indexes on <table_name> – both XML and non-XML ones – and will take longer.

    ·         If you want to use SQL Server Management Studio to rebuild the XML indexes, see the topic “How to: Rebuild an Index (SQL Server Management Studio)” in SQL Server 2005 Books Online.

     

    I hope this information helps you with Beta 2 to June CTP upgrade.

     

    Thank you,

     

    Shankar

    Program Manager

    Microsoft SQL Server

     

    This posting is provided "AS IS" with no warranties, and confers no rights.

    Use of included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm

  • MSDN whitepapers for native XML support in SQL Server 2005

    SQL Server 2005 June CTP was released earlier this month. It contains many improvements and is a very important milestone on the way to RTM. If you haven’t upgraded yet, please do so! For the native XML support, I’d like to point out the following. 

     

    MSDN whitepapers:

    We’ve updated the MSDN whitepapers and added new ones. Here are the links:

     

    XML feature overview: http://msdn.microsoft.com/xml/default.aspx?pull=/library/en-us/dnsql90/html/sql2k5xml.asp

    XML Best Practices: http://msdn.microsoft.com/xml/default.aspx?pull=/library/en-us/dnsql90/html/sql25xmlbp.asp

    XML Optimizations: http://msdn.microsoft.com/xml/default.aspx?pull=/library/en-us/dnsql90/html/sqloptxml.asp

    XML Generation from tables: http://msdn.microsoft.com/xml/buildingxml/xmlanddatabase/default.aspx?pull=/library/en-us/dnsql90/html/forxml2k5.asp

    XML Options in Microsoft SQL Server 2005: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql90/html/sql2k5xmloptions.asp

    Introduction to XQuery in SQL Server 2005: http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql90/html/sql2k5_xqueryintro.asp

     

    The XML optimizations and XML best practices whitepapers in particular have much more content – some of the new optimizations are very cool! Check these out by following the appropriate links.

     

    When you read these whitepapers, please take a moment to rate their content and their presentation.

     

    Enjoy!

     

    Thank you,

     

    Shankar

    Program Manager

    Microsoft SQL Server

     

    This posting is provided "AS IS" with no warranties, and confers no rights.

  • Eliminating Multiple Evaluations of XML Data Type Methods

    I have seen a few scenarios in which a value is extracted from an XML data type instance in SQL Server 2005. The query performs some checks on the value such as its type (e.g. whether it is numeric) or compares it with a constant, and returns a value based on this check. The value extraction is done using the value() method of XML data type, and the method is re-evaluated in the SELECT list. I want to share a simple tip for better performance. Consider the example

    DECLARE @x XML
    SET   @x = '<book ISBN="0-7356-1588-2">
                      <title>Writing Secure Code</title>
                      <subject/>
                      <price>39.99</price>
                </book>'

    The price of the book can be retrieved using the following query as nvarchar(32) if it is not known for sure that it is actually a numeric value in all <book> instances:

    SELECT @x.value ('(/book/price)[1]', 'nvarchar(32)') 

    Suppose the query then wants to convert it to a decimal value if it is numeric and return 0 if it is not. The full query then becomes:

    SELECT case isnumeric(@x.value ('(/book/price)[1]', 'nvarchar(32)'))
           when 1 then @x.value ('(/book/price)[1]', 'decimal(5,2)')
                  else 0
           end

    This query computes the value() method twice. A simple trick avoids the recomputation by assigning the result of @x.value ('/book/price', 'nvarchar(32)') to a SQL variable of type nvarchar(32) and converting the variable to decimal:

    DECLARE @v nvarchar(32)
    SET   @v = (SELECT @x.value ('(/book/price)[1]', 'nvarchar(32)'))
    SELECT case isnumeric(@v)
           when 1 then CAST (@v AS decimal(5,2))
                  else 0
           end

    The rewrite performs significantly faster than the earlier query since it manually optimizes the query to reuse computed values. It also simplifies the query plan quite a bit. There is an underlying assumption on its benefit– the majority of the <price> values are numeric, otherwise the workload does not perform much faster.

    Often, there are many more values you want to extract for the XML instance, and breaking up the query into a separate one for each variable assignment is inconvenient. The query may also involve the nodes() method, so that breaking up the query increases the number of nodes() method invocations and requires table-valued variables - the benefits are then lost. A subquery comes to our rescue! Look at the following rewrite in which the value() method is computed in a subquery and aliased as T(Price) and reused in the outer SELECT:

    SELECT case isnumeric(Price)
           when 1 then CAST (Price AS decimal(5,2))
                  else 0
           end 
    FROM  (SELECT @x.value ('(/book/price)[1]', 'nvarchar(32)') Price) T

    You get the same benefits as before, ignoring the difference between the variable assignment and the subquery costs, which are much smaller than the cost of the value() method computation. The same optimization can be used in other places as well:

    SELECT NULLIF (Subj, '')
    FROM  (SELECT @x.value ('(/book/subject)[1]', 'nvarchar(64)') Subj) T

     
    The value() method returns the empty string '' for subject, the NULLIF compares the returned value with its second argument '' and returns NULL since these two values are equal. Replacing Subj with the value() method in NULLIF (Subj, ''), as shown below, computes the value() method twice when the return value is other than the empty string. This double computation is much slower:

    SELECT NULLIF (@x.value ('(/book/subject)[1]', 'nvarchar(64)'), '')

    You get the idea. We can put the two queries together as follows:

    SELECT case isnumeric(Price)
                when 1 then CAST (Price AS decimal(5,2))
                  else 0
                end,
           NULLIF (Subj, '')
    FROM  (SELECT @x.value ('(/book/price)[1]', 'nvarchar(32)') Price,
                  @x.value ('(/book/subject)[1]', 'nvarchar(64)') Subj 
           ) T

    You might wonder – why doesn’t the query optimizer do this trick? It certainly is an option and hey – if we did it all in one release, what would we do in the future? ;-)

    Best wishes and thank you,

    Shankar
    Program Manager
    Microsoft SQL Server

    This posting is provided "AS IS" with no warranties, and confers no rights.

    Use of included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm


     

  • SQL Script for Typed XML used in MSDN Webcast on Dec 7, 2004

    Hello!

     

    This posting contains T-SQL script for working with typed (i.e. XML schema bound) XML. I used it during the MSDN webcast on Tue, Dec 7, 2004, on “Introducing XML in SQL Server 2005”. Yesterday’s posting (http://weblogs.asp.net/spal/archive/2004/12/20/327635.aspx) contained the script for untyped XML. Thanks for your patience!

     

    I have added some more annotation to the script to make it more readable – so please don’t skip the comments. I have the habit of repeating key phrases – this makes the text a little verbose but avoids ambiguity.

     

    Here is a list of the MSDN whitepapers on XML support in SQL Server 2005 that you might find useful:

    Of course, SQL Server 2005 books-online is an invaluable resource.

     

    I hope these scripts help you get started with the new XML features.

     

    Thanks and enjoy!

     

    Shankar

    Program Manager

    Microsoft SQL Server

     

    -----------------------------------------------------------------------

    -- First, the formality:

    -- Use of included script samples are subject to the terms specified at

    -- http://www.microsoft.com/info/cpyright.htm

    -----------------------------------------------------------------------

     

    -- This T-SQL script illustrates the use of typed (i.e. XML schema-bound) XML data type on SQL Server 2005.

     

    ---------------------------------------------------------------------------------------------------------

    -- NOTE: Before we get started, save the following data literally into a file C:\temp\TypedPropTable.txt.

    -- If you ch