Welcome to MSDN Blogs Sign in | Join | Help

Galex's Semi-Structured Blog

SQL and XML in SQL Server 2005
Pattern facets and round trips.

If you are defining a schema that has elements or attribute of numeric types and require more restrictions on their value, try to avoid using pattern facets. If you do, you will get the warning:

Warning: Type (name of your type) is restricted by a facet 'pattern'
that may impede full round-tripping of instances of this type.

You may end up doing this because you want to specify that the number start with the digit 1 and end with the digit 9. Of course, this isn't a common scenario and you are more likely to want a certain number of digits or a minimum or maximum value. In those case, use the minInclusive, minExclusive, maxInclusive and maxInclusive facets.

xs:anySimpleType

One interesting simple type that can be the cause of a number of static type issues is xs:anySimpleType. This type is the base type of all simple types. That includes list and union types. By default, attributes are typed xs:anySimpleType, so if you do not specify a type via the type attribute, any simple type value is valid for that attribute. This includes list type values. You can also have element content typed to xs:anySimpleType. In that case, you can optionally specify the specific simple type that a specific element instance contains.

Because xs:anySimpleType includes list types, you may end up with static typing errors if you do not specify a type for an attribute and are querying for the attribute in a typed xml column. The problem has to do with cardinality. The error message that you get can be confusing and the typical way to correct such a static typing error does not work in this case. Here is an example:

create xml schema collection att as N'
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="mult">
        <xs:complexType>
            <xs:sequence>
                <xs:element type="xs:string" 
name="elem" maxOccurs="unbounded"/> </xs:sequence> <xs:attribute name="att"/> </xs:complexType> </xs:element> </xs:schema>' go declare @x xml(att) set @x = 'string' select @x.query('/mult/elem cast as xs:string?') go declare @x xml(att) set @x = 'string' select @x.query('/mult/@att cast as xs:string?')

These two queries will result in the follow error messages:

Msg 2365, Level 16, State 1, Line 3
XQuery [query()]: Cannot explicitly convert 
from 'xs:string *' to 'xs:string ?' Msg 2365, Level 16, State 1, Line 4 XQuery [query()]: Cannot explicitly convert
from 'xdt:anyAtomicType *' to 'xs:string ?'

The first query is easy to fix. Because there can be more than one elem element under the mult element, you must specify which chlid element you really want. We change the query to be (/mult/elem)[1] cast as xs:string?. So why do we see the same error with the second query? There can only be one @att element under the mult element, so why is the type that is inferred xdt:anyAtomicType*? If we try to fix the second query using the same workaround as the first, we still get the same error message! Even though you don't see it, the attribute is being implicitly atomized because of the cast operator. Because xs:anySimpleType includes list types, the atomization results in potentially zero or more atomic values. Since the query compiler can not possible know up front what the type is for the atomic values, we infer xdt:anyAtomicType.

The workaround is to explicitly atomize the attribute and place a positional predicate or more simply, supply an atomic type for the attribute. To atomize the attribute you can change the query as such:

declare @x xml(att)
set @x = '<mult att="foo"><elem>string</elem></mult>'
select @x.query('data(/mult/@att)[1] cast as xs:string?')

Note that the work around is also applicable for elements and attributes that are list types.

Content models and complex types

Today, I want to a little about how content models are defined in complex types. These are important concepts in not only creating schemas and instances that are valid against the schemas, but will also help in understanding static typing and static typing errors within XQuery (this will be topic in a couple of posts).

The important thing to understand about complex types is that they define content models that contain elements and attributes. The structure and potentially the order of elements within a complex type is the content model. XSD allows you group your content that specifies if all the elements are required and should appear in order in a valid instance. These content groups and elements within these content groups can be specified as optional and may even repeat an arbitrary number of times.

A sequence group specifies that the content should all be present and in order. A choice group specifies that one item in the group should be present. An all group specifies that all the content should be present, but in any order. Sequences and choice groups can be nested within each other. An all content group can only be at the top of a complex content model.

Here is an example:

<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
   <
xs:complexType name="foo"
>
      <
xs:sequence maxOccurs="unbounded"
>
         <
xs:element name="foo" type="xs:string"
/>
         <
xs:element name="bar" type="xs:int" minOccurs="0"
/>
         <
xs:choice minOccurs="0"
>
            <
xs:element name="baz" type="xs:double"
/>
            <
xs:sequence
>
               <
xs:element name="foo2" type="xs:dateTime"
/>
               <
xs:element name="bar2" type="xs:decimal"
/>
            </
xs:sequence
>
         </
xs:choice
>
      </
xs:sequence
>
   </
xs:complexType
>

   <xs:complexType name="bar">
      <
xs:all
>
         <
xs:element name="foo" type="xs:string"
/>
         <
xs:element name="bar" type="xs:string" minOccurs="0"
/>
      </
xs:all
>
   </
xs:complexType
>
</
xs:schema
>

The complex type "foo" is defined as a sequence that has an element foo followed by bar and finally a choice between an element baz and another sequence containing foo2 and bar2. And, the complex type "bar" is defined an all content that requires elements foo and bar in any order.

Notice that I can specify minOccurs and maxOccurs attributes on model groups and elements. For the complex type "foo", the sequence can be repeat an arbitrary number of times (maxOccurs="unbounded". And the element bar within this sequence does not have to appear at all within one instance of the sequence (minOccurs="0"). The same goes for the choice model group.

Union and Lists

Last time we talked about built-in simple types and how additional types can be defined as restrictions of those types. Now, we will talk about the remaining kinds of simple types: unions and lists.

According to the W3C spec, list types are data types where the legal values are finite-length sequence of values. Each list type can be a list of a specific itemType. Within SQL Server 2005, we do not permit the creation of list of lists, list of union types and list of xs:ID. List types can be derived from the built-in atomic types, user defined restricted atomic types or anonymous types. The lexical representation of instances of list types are space delimited sequences of instances of the itemType. That means itemTypes that require spaces will not validate correctly when used within a list. For example, if I had a simple type that was a enumeration of three possible values "foo bar", "foo baz" and "bar baz", I would not be able to define a list of this type. When the validator runs across the instance "foo bar foo baz bar baz", it will try to validate "foo", "bar", "foo", "baz", "bar" and "baz" as instances of the itemType; all of which will fail. List types are XSD's intrinsic collection type. Although other mechanisms in XSD can be used to form collections of items, the list type may be more suitable for your needs. In additional to being able to create list types from atomic types, new types can derived from list types that restrict the list further by length, enumeration and pattern facets.

Union types are defined as types where the value and lexical space is the union of the value and lexical spaces of one or more simple types. You could create a union type that was the union of xs:duration and xs:time. During validation of an instance of this type, the lexical value will be validated against xs:duration first and if it doesn't pass validation, it will be validated against xs:time. However, if you want to have an instance validate against a specific memberType, you can specify that within the instance element via the xsi:type attribute. XSD unions are similar to C unions but have subtle differences. For one thing, the actual type of the value is not defined by a type descriptor. During validation, the PSVI is annotated with the actual type of the value.  

CREATE XML SCHEMA COLLECTION foo AS N'
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:tn="urn:test" targetNamespace="urn:test">
 <xs:simpleType name="myAtomicType">
  <xs:restriction base="xs:string">
   <xs:pattern value="(ab)*"/>
  </xs:restriction>
 </xs:simpleType>

 <xs:simpleType name="myList">
  <xs:list itemType="tn:myAtomicType"/>
 </xs:simpleType>

 <xs:simpleType name="myAnonymousList">
  <xs:list>
   <xs:simpleType>
    <xs:restriction base="xs:int">
     <xs:totalDigits value="8"/>
    </xs:restriction>
   </xs:simpleType>
  </xs:list>
 </xs:simpleType>

 <xs:simpleType name="myUnion">
  <xs:union memberTypes="tn:myAnonymousList tn:myAtomicType xs:float"/>
 </xs:simpleType>
</xs:schema>'

I've created two list types, one that uses a type I've created, another using an anonymous type defined only within its scope. I've also created a union type that is a union of a list type, a restricted type and a builtin type.

Simple Types

My fellow team members have all blogged on static typing and anyone reading these blogs would be correct to conjecture that it is pretty important. There is so much to say about it and I'd rather not duplicate what my team members have written. For an introduction, read Denis' entry here and Mike's entry here. After reading these posts, it should be apparent that the point of static typing is to aid the developer in writing correct queries. Static typing is often used in other languages such as C# which are strongly typed. XQuery is a strongly typed language and implementations have the option of providing static type checking at query compile time. The XQuery implementation in SQL Server 2005 does in fact perform static type checking.

Before going into static typing, it is important to understand types within XML: where they come from, how they are defined and what can they "look" like. I will spend the next few posts talking just about types.

The type system in XQuery is based on XML Schema (XSD). If you don't plan on using XSD validation in your XML, you might think there is no point in learning more about types. However, static typing is still in affect when quering untyped XML within XQuery. The difference between typed and untyped instances is that the former is validated against an XSD collection. I won't go into the details of XSD, but I suggest those who are not familiar with it read Part 0 of the XSD specification.

Part 2 of the XSD specification talks about primitive and builtin types. XQuery inherits these builtin types into its own type system. Along with simple and complex types from XSD, XQuery defines node types. XQuery also has the notion of a sequence of items. Items are either atomic values or nodes. (Don't worry, these will become clear later)

Atomic types within XQuery are the builtin primitive types (simple types) and the types derived from these primitives by restriction or union. XQuery defines a few extra simple types: xdt:untypedAtomic, xdt:anyAtomicType, xdt:yearMonthDuration and xdt:dayTimeDuration (SQL Server 2005 does not support the last two types). We will discuss these types in the future. Like XSD, the builtin types just exist and the user has the ability to introduce new types. In SQL Server 2005 we do this by creating an XSD schema collection. To create new atomic simple types, we have to restrict an existing atomic simple type. For example, if I wanted to create a type named foo that is an integer greater than 10, I would execute this DDL:

CREATE XML SCHEMA COLLECTION new_type_collection AS N'
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:tn="urn:new_type" targetNamespace="urn:new_type">
    <xs:simpleType name="foo">
        <xs:restriction base="xs:integer">
            <xs:minExclusive value="10"/>
        </xs:restriction>
    </xs:simpleType>
    <xs:element name="fooElement" type="tn:foo"/>
</xs:schema>'

I've also defined an element name fooElement. Assuming I have an XML instance, I can now use this type within a query.

declare @x xml(new_type_collection)
set @x = '<tn:fooElement xmlns:tn="urn:new_type">11</tn:fooElement>'
select @x.query('declare namespace tn="urn:new_type"; tn:foo("12") > /tn:fooElement')

Notice that I created an instance of the type foo by using a constructor syntax. What I have done is created an instance with the numeric value of 12 and then compare it to the value stored within the instance I am querying. In this case, the result is true. I can also create instances of builtin types with the constructor syntax:

declare @x xml(new_type_collection)
set @x = '<tn:fooElement xmlns:tn="urn:new_type">11</tn:fooElement>'
select @x.query('declare namespace tn="urn:new_type"; xs:integer("0") > /tn:fooElement')

Here, I've created an instance of xs:integer with the value 0 and compared it to the stored instance.

Okay, so now you know what builtin types are, how to create new restricted types from them via an XML Schema and how to create instances of these simple types within a query. The builtin simple types consist of string, date/time, floating point, decimal, integer, binary and various other types. To learn more about them, their semantics, restrictions and valid value and lexical spaces, read Part 2 of the XSD Specification. If you prefer to read a book try Definitive XML Schema by Priscilla Walmsley. Here is a diagram of the builtin type hierarchy.

Next time, I will talk about union and list types and their properties within XQuery.

The XML Team is blogging!

Welcome to my MSDN blog! My name is Galex Yen and I've been working within the XML datatype team in SQL Server for the past year. My team has begun to blog about our exciting new feature and it seems I'm one of the late comers. I've linked to my fellow team members' blogs under the category "Other MSDN XML Blogs." Our hope is to provide a one stop shop for dialogue and education on the native XML datatype.

I'm very enthused about how we've integrated XML into SQL Server 2005 and I hope that all of you will learn something from our posts. I encourage you to ask questions and provide feedback.

Page view tracker