Stan Kitsis

XML Tools and all things schema in system.xml and MSXML

  • What's new in MSXML 6.0: Schema-related changes

    In this post I will outline the major schema-related changes in MSXML 6.0 and what you might need to do if you rely on the old behavior.  Questions? Comments? Let me know.

    XDR Support

    MSXML6 has removed support for XDR schemas.  XML Schema (XSD) 1.0 has been a W3C Recommendation for almost four years, so we decided to discontinue support for the proprietary XDR schemas in MSXML6.  XDR schemas will continue to be supported in earlier versions of MSXML.

    • If you are using MSXML6, you will need to convert your XDR to XSD.

    Schema Compilation Changes

    1.     Support for Partial Schemas

    Summary: If two XSD files are loaded from different locations for the same namespace the Add method will union the declarations found in both locations.  In the past, adding a namespace from a secondary location would replace the definitions.

    Scenarios: Any scenario where the user is calling Add more than once on the same namespace may be affected.   Also, see SchemaCache::getSchema changes below.

    If you rely on old behavior: Create a new SchemaCache and add only the appropriate schema

    2.     Schema flattening

    Summary: We now “flatten” schema imports (xs:import) so every namespace referenced is a first class citizen in the schema cache.  This ensures there is only one unique definition for every schema type in the schema cache.

    Scenarios: Any scenario where a common namespace was imported from more than one location by two or more namespaces – in other words, any situation where there could be ambiguous definitions for a type in the SchemaCache depending on the namespace it is used from.  Also, SchemaCache.Length might have a different value.

    This is a breaking change and does not lend itself to the old behavior

    3.     Improved Support for Runtime Schemas

    Summary: Inline schemas and schemas referenced from an instance using xsi:schemaLocation are now added to an XML instance-specific cache which wraps the user-supplied SchemaCache.

    Scenarios: This enables some more complex scenarios where cross-references between runtime schemas are handled appropriately.

    This is a non-breaking change – it is enabling new scenarios

    4.     Lax Imports

    Summary: The schema cache will compile a schema that has a reference to any other type already in the schema cache regardless of whether there is an explicit import or not (this is like an import with no location).  In order to make the add order insignificant the validateOnParse flag must be set to false when schema is loaded.

    This is a non-breaking change

    SchemaCache API Changes

    1. SchemaCache::get – not implemented.  This scenario was targeted primarily at XDR, so it has been removed.
    2. SchemaCache::remove – not implemented.  Previously the remove method removed a namespace and all of its imported namespaces.  Now that imported schemas are promoted, they may have multiple dependencies, so this method is no longer implemented.

    o       The simple workaround is to create a new schema cache and add the desired schemas.

    3. SchemaCache::getSchema – If the namespace is loaded from one location, there is no change to this API.  If the namespace is loaded from multiple locations, an “encapsulating” schema is created that unions all the declarations under a single generated schema tag.

    o       There is no workaround to get to the old behavior.

    4. SchemaCache::addCollection – When one schema cache is added to another, it supports partial schemas just like a call to Add( ).  addCollection is atomic: either all schemas in the cache are added or none are.

    o       If you rely on the old behavior, create a new SchemaCache and add the appropriate schemas.

    5. get_schemaLocations – Previously returned all the schema locations that were included, imported, or redefined.  It now reports only the schema locations for the schema's own namespace, for the same reason that a schema now holds only the elements, types, etc. of its own namespace.

    o       To get the old behavior, you need to step through the included/imported/redefined schemas manually and call get_schemaLocations for each one of them.  Note that if you hit schemas loaded from multiple files, you will get multiple schema locations, which is different from the old behavior.

  • Inline Schemas

    MSDN just published my article on inline schemas.  Check it out.

     

  • XML Editor in VS2005: Support for inline schemas

    The XML Editor provides support for Inline Schemas.  Inline schemas are XML schema definitions included inside XML instance documents.  They can be used to validate that the rest of the XML matches the schema constraints in the same way that external schema documents can be used.   Likewise, the syntax and semantics of inline schemas are the same as for external schemas. Inline schemas can be useful in a number of situations, including:

    ·          An architecture where internal DTDs were used and the developers wish to preserve that design pattern.

    ·          It is difficult to access external files or URLs, e.g. for security or platform reasons.

    ·          There is too much diversity in the set of schemas and instances that a system must process, so it is easiest to simply keep the schema as an integral part of the XML document.

     

    The following XML snippet contains an example of using an inline schema. 

     

    <?xml version="1.0" encoding="utf-8"?>
    <root xmlns:inl="http://inline">

      <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
                 targetNamespace="http://inline"
                 xmlns="http://inline"
                 elementFormDefault="qualified"
                 attributeFormDefault="unqualified">
        <xs:element name="parent">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="child" type="xs:string"/>
            </xs:sequence>
          </xs:complexType>
        </xs:element>
      </xs:schema>

      <inl:parent>
        <inl:child>text</inl:child>
      </inl:parent>

    </root>

     

    You can change the inline schema and the XML Editor will pick up those changes immediately and use the updated schema for validation and intellisense.  For example, if you change the <child> element’s type from xs:string to xs:int, you will get a validation error.
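
    Outside the editor, a similar check can be sketched with System.Xml.  The snippet below is only an illustration under stated assumptions, not a description of what the XML Editor does internally: the class InlineSchemaDemo and its Sample and Validate helpers are hypothetical names, and the document is the example above with the <child> type made configurable.  It relies on XmlReaderSettings with the ProcessInlineSchema validation flag.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class InlineSchemaDemo
{
    // Hypothetical helper: builds the sample document with the given type for <child>.
    public static string Sample(string childType)
    {
        return
            "<root xmlns:inl='http://inline'>" +
            "  <xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'" +
            "             targetNamespace='http://inline' xmlns='http://inline'" +
            "             elementFormDefault='qualified'>" +
            "    <xs:element name='parent'>" +
            "      <xs:complexType><xs:sequence>" +
            "        <xs:element name='child' type='" + childType + "'/>" +
            "      </xs:sequence></xs:complexType>" +
            "    </xs:element>" +
            "  </xs:schema>" +
            "  <inl:parent><inl:child>text</inl:child></inl:parent>" +
            "</root>";
    }

    // Hypothetical helper: validates a document against the schema embedded in it
    // and returns the messages of all validation errors (warnings are ignored).
    public static List<string> Validate(string xml)
    {
        List<string> errors = new List<string>();

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.ValidationType = ValidationType.Schema;
        // Inline schemas are not processed unless this flag is set.
        settings.ValidationFlags |= XmlSchemaValidationFlags.ProcessInlineSchema;
        settings.ValidationEventHandler += delegate(object sender, ValidationEventArgs e)
        {
            if (e.Severity == XmlSeverityType.Error)
            {
                errors.Add(e.Message);
            }
        };

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read()) { } // validation happens as the reader advances
        }
        return errors;
    }
}
```

    Calling InlineSchemaDemo.Validate(InlineSchemaDemo.Sample("xs:string")) should report no errors, while passing "xs:int" should report one for the <child> element, since "text" is not an integer.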

  • XML Editor in VS2005: Creating XML files using the VB Development settings

    If you are a VB developer and have tried to use the new XML Editor, you might have noticed that the File menu does not have a File | New option (this is only true if you selected the VB Development settings).  So, how do you go about creating new XML files?  Lisa Feigenbaum, a Program Manager on the VB team, talks about two workarounds: how to create a shortcut to File | New and how to get it back on the menu.

     

  • MSXML 6.0 vs. System.Xml: Schema handling differences

    One of the goals we had for MSXML6 and system.xml in .NET 2.0 was to bring down the number of differences in schema processing between the two engines.  I think we did a pretty good job here, but there are still a few differences left.  The table below lists all known differences.  If you find something that is not listed, please let me know.

     

    In some cases both engines are compliant with the spec but have different behavior (when the spec is not specific).  In others, one engine is compliant (denoted by a "*" in the table below) while the other one isn't.  It is our goal to eliminate all the differences.  However, some of them are corner case scenarios and we might decide not to invest time and resources in fixing them.  If you really want to see a particular scenario fixed, let me know.

     

    Feature | MSXML 6 | System.Xml
    Using XML attributes without explicit schema declaration | not supported | supported
    element of type xs:ID | supported* | not supported
    Conflicting values of facets minInclusive, maxInclusive in base and derived types | not allowed* | allowed
    maxOccurs=0 with no minOccurs | error* | no error
    Multiple redefines of the same schema document | error | warning (only first redefine is processed)
    parsing of targetNamespace attribute as anyURI | supported* | not supported
    Regex support for Unicode Character blocks | Unicode 3.1 | Unicode 4.0.1
    Regex patterns combining a negative group with another using the alternation operator (vertical bar) | not supported | supported*
    XSD Errata for Regex E2-52 | not supported | supported*
    Regex character class subtraction | not supported | supported*
    Matching newlines/linebreaks from a xsd:pattern facet | not supported | supported*
    minOccurs/maxOccurs value limits | up to 2^32 | CLR decimal
    limit of totalDigits facet values | up to 2^32 | CLR decimal
    datatypes used to store length, minLength, maxLength facet values | up to 2^32 | CLR decimal
    a prohibited attribute in a complex type | no error, attribute stripped* | warning, attribute stripped*
    unreferenced groups in the schema | always compiled | compiled on reference
    default element values availability | not supported | supported*
    expose post schema validation infoset | not supported | supported*
    ENTITY in DTD resolvable by attribute/elements of type xs:ENTITY | DOM=Yes, SAX=no | Yes
    adding a default qualified attribute to an element in a document that has attribute's namespace as default | DOM: apply qualified default attribute with namespace value as the prefix. SAX: does not generate prefix for default qualified attributes | error
    identity constraints (key/unique) evaluation for skip/lax blocks | node is null | not able to find node
    element with mixed content, and a fixed schema value, containing child elements | instance accepted | instance not accepted*
    whitespace facet on anySimpleType | preserve* | collapsed
    XSD Errata on xs:base64Binary parsing | supported* | not supported
    xs:dateTime: number of year digits supported | 10 | 4
    xs:dateTime: number of fraction digits supported in seconds | 9 | 7
    xs:dateTime: range of hours in time zone | -14:00 to +14:00* | 99:00 to -99:00
    xs:dateTime: negative years | supported* | not supported
    "z" (as opposed to "Z") in xs:dateTime to represent UTC time | not allowed* | allowed
    xs:time: range of hour value | 00:00:00 to 24:00:00* | 00:00:00 to 23:59:59
    xs:gMonth: XSD Errata revised the lexical representation of gMonth | only --MM is allowed* | both --MM and --MM-- are allowed
    maximum digits allowed for the xs:decimal data type | 128 | 29
    xs:duration: duration with second part specified, but has no digits | error | no error, infer a 0 for the second part

  • XML Editor in VS 2005: Did you know?

    Did you know that there are multiple ways of associating a schema with an XML document in the XML Editor?  The following list describes all of them in the order in which the XML Editor will look for them:

     

    1. Schemas Property on your XML document
    2. Inline inside your XML document
    3. xsi:schemaLocation or xsi:noNamespaceSchemaLocation attributes in your XML document
    4. Open Document Window
    5. Anywhere in your current Project
    6. In the Schema Cache Directory or from a Schema Catalog file.

    This can be useful if you are working with multiple versions of the same schema.  If the XML Editor picks a different schema than the one you want it to use for a particular XML file, then you can always override that choice by editing the Schemas Property on that XML file. 

     

    Note that if you rename the schema that your XML document is referencing, then the Schemas Property will be automatically updated to point to the new filename.  And if you delete the schema that your XML document is referencing, then the Schemas Property will be cleared and the XML Editor will try to find a new schema using the locations and order listed above.

     

  • XML Editor in VS2005: Schema Cache and Schema Catalogs

    Below are a few tips and tricks on using schema cache and schema catalogs with the new XML Editor in VS 2005.

    Schema Cache

    The XML Editor comes with a set of schemas that describe some of the W3C standards and some of the VS-specific XML namespaces.  These schemas are installed into your VS installation directory under %vsinstalldir%\xml\schemas.  When you declare one of the namespaces defined by these schemas in your XML files, the XML Editor will automatically associate the appropriate schema(s) from the cache location and instantly provide you with intellisense data and validation.  The purpose of the schema cache directory is to hold standard schemas as well as schemas that are unlikely to change.

    The following operations are supported by the XML Editor without requiring a VS restart.
    - Adding schemas
    - Deleting schemas
    - Renaming schemas

    You can also change the schema cache location.  This is done by modifying Cache Directory Location in the Tools | Options | Text Editor | XML | Miscellaneous dialog.  When you do this, the XML Editor will stop using schemas from the old location and instead switch to the new folder.

    Schema Catalogs

    You can extend the existing schema cache using a catalog.xml file.  VS installs a sample catalog.xml file (along with catalog.xsd) in the schema cache folder.  This file can be used to do several things.  First of all, you can associate namespaces with external locations as in the following example:

    <Schema href="%VsInstallDir%/Common7/IDE/Policy/Schemas/TDLSchema.xsd"
       targetNamespace="http://www.microsoft.com/schema/EnterpriseTemplates/TDLSchema"/>

    You can also associate file extensions with specific schemas.  The following line, taken from the sample catalog.xml, associates .config files with the dotNetConfig.xsd schema:

    <Association extension="config" schema="%VsInstallDir%/xml/schemas/dotNetConfig.xsd"/>

    Finally, you can point your catalog to another catalog file creating a chain:

    <Catalog href="http://mycompany/catalog.xml"/>

     

  • Walking XmlSchema and XmlSchemaSet objects

    I’ve seen a number of newsgroup posts asking how to find a particular element or how to get a list of all elements from either XmlSchema or XmlSchemaSet objects.  Since we don’t provide this functionality in the framework, you need to manually traverse these objects to get what you want.  Depending on what your goal is, you might need to get either pre-compile or post-compile information.  For example, named groups are not available post-compile while PSVI information is not available pre-compile.  In this post I’ll show you how you can get pre-compile information from these objects.  To make the reading easier, I don’t include any error or exception handling code which is not relevant.

     

    I’m assuming that you have an XmlSchemaSet with a few schemas added to it.  The XmlSchemaSet provides collections of global elements, types, and attributes.  However, these collections are empty until you compile the set.  Note that a named group collection is not available in the XmlSchemaSet.  To get the pre-compile info (including a list of named groups), you will need to go through each schema in the set.
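
    As a minimal sketch of that setup, the snippet below builds a small XmlSchemaSet, reads GlobalElements before and after Compile(), and counts named groups by going through each schema's Items.  The class SchemaSetDemo, its methods, and the sample schema are hypothetical and exist only for illustration.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class SchemaSetDemo
{
    // Hypothetical sample schema: one global element and one named group.
    const string Xsd =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>" +
        "  <xs:element name='parent' type='xs:string'/>" +
        "  <xs:group name='g'>" +
        "    <xs:sequence><xs:element name='a'/></xs:sequence>" +
        "  </xs:group>" +
        "</xs:schema>";

    // Builds an uncompiled schema set holding the sample schema.
    public static XmlSchemaSet Load()
    {
        XmlSchemaSet ss = new XmlSchemaSet();
        ss.Add(null, XmlReader.Create(new StringReader(Xsd)));
        return ss;
    }

    // Returns the GlobalElements count before and after Compile().
    public static int[] CountsAroundCompile()
    {
        XmlSchemaSet ss = Load();
        int before = ss.GlobalElements.Count;
        ss.Compile();
        return new int[] { before, ss.GlobalElements.Count };
    }

    // Named groups are not exposed on the set, so count them per schema via Items.
    public static int CountNamedGroups(XmlSchemaSet ss)
    {
        int count = 0;
        foreach (XmlSchema schema in ss.Schemas())
        {
            foreach (XmlSchemaObject item in schema.Items)
            {
                if (item is XmlSchemaGroup) count++;
            }
        }
        return count;
    }
}
```

    Items is read here as XmlSchemaObject because a schema's top-level items are not limited to elements, types, and groups.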

     

    foreach (XmlSchema schema in ss.Schemas())
    {
    }

     

    Once you have an XmlSchema object, you can step through and process each global (type, element, or named group):

     

    // stepping through global complex types
    foreach (XmlSchemaType type in schema.SchemaTypes.Values)
    {
        if (type is XmlSchemaComplexType)
        {
        }
    }

    // stepping through global elements
    foreach (XmlSchemaElement el in schema.Elements.Values)
    {
    }

    // stepping through named groups
    // (Items can also hold a top-level xs:annotation, which is not an
    // XmlSchemaAnnotated, so iterate as XmlSchemaObject to avoid a cast failure)
    foreach (XmlSchemaObject item in schema.Items)
    {
        if (item is XmlSchemaGroup)
        {
        }
    }

     

    Now that we have a global, whether it’s a type, an element, or a group, how do we traverse it?  I’m going to use a recursive method that takes an XmlSchemaParticle to do it.

     

    void walkTheParticle(XmlSchemaParticle particle)
    {
        if (particle is XmlSchemaElement)
        {
            XmlSchemaElement elem = particle as XmlSchemaElement;

            // todo: insert your processing code here

            // only descend into elements declared here, not references to globals
            if (elem.RefName.IsEmpty)
            {
                XmlSchemaType type = elem.ElementSchemaType;
                if (type is XmlSchemaComplexType)
                {
                    XmlSchemaComplexType ct = type as XmlSchemaComplexType;
                    // anonymous (unnamed) types are walked here; named global
                    // types are visited separately by the caller
                    if (ct.QualifiedName.IsEmpty)
                    {
                        walkTheParticle(ct.ContentTypeParticle);
                    }
                }
            }
        }
        else if (particle is XmlSchemaGroupBase)
        {
            // xs:all, xs:choice, xs:sequence
            XmlSchemaGroupBase baseParticle = particle as XmlSchemaGroupBase;
            foreach (XmlSchemaParticle subParticle in baseParticle.Items)
            {
                walkTheParticle(subParticle);
            }
        }
    }

     

    If the particle passed to the walkTheParticle is a base group (all, choice, or sequence), we will loop through each element within this base group.  If the particle is an element, we will do our processing and then (if the element is of a complex type) walk through it.  Finally, below are the last touches to the calling method to make it all work.

     

    void start(XmlSchemaSet ss)
    {
        foreach (XmlSchema schema in ss.Schemas())
        {
            // global complex types
            foreach (XmlSchemaType type in schema.SchemaTypes.Values)
            {
                if (type is XmlSchemaComplexType)
                {
                    XmlSchemaComplexType ct = type as XmlSchemaComplexType;
                    walkTheParticle(ct.ContentTypeParticle);
                }
            }

            // global elements
            foreach (XmlSchemaElement el in schema.Elements.Values)
            {
                walkTheParticle(el);
            }

            // named groups; iterate as XmlSchemaObject because Items can also
            // hold a top-level xs:annotation, which is not an XmlSchemaAnnotated
            foreach (XmlSchemaObject item in schema.Items)
            {
                if (item is XmlSchemaGroup)
                {
                    XmlSchemaGroup xsg = item as XmlSchemaGroup;
                    walkTheParticle(xsg.Particle);
                }
            }
        }
    }

     

    That’s about it.  Let me know if you have any questions.
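
    One caveat worth sketching: the walker above reads ElementSchemaType and ContentTypeParticle, which System.Xml documents as post-compilation properties.  If you want to stay strictly with what is in the parsed schema document, a variant can use only Items, SchemaType, and Particle.  The code below is an illustrative sketch rather than a complete walker: PreCompileWalker and its methods are hypothetical names, it only collects element names, and it does not follow xs:group references, type references by name, or complexContent extensions.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml.Schema;

public static class PreCompileWalker
{
    // Parses a schema from a string without adding it to a set or compiling it.
    public static XmlSchema Parse(string xsd)
    {
        return XmlSchema.Read(new StringReader(xsd), null);
    }

    // Collects the names of element declarations reachable from the top-level items.
    public static List<string> CollectElementNames(XmlSchema schema)
    {
        List<string> names = new List<string>();
        foreach (XmlSchemaObject item in schema.Items)
        {
            XmlSchemaElement el = item as XmlSchemaElement;
            XmlSchemaComplexType ct = item as XmlSchemaComplexType;
            XmlSchemaGroup group = item as XmlSchemaGroup;

            if (el != null) Walk(el, names);
            else if (ct != null) Walk(ct.Particle, names);
            else if (group != null) Walk(group.Particle, names);
        }
        return names;
    }

    static void Walk(XmlSchemaParticle particle, List<string> names)
    {
        XmlSchemaElement el = particle as XmlSchemaElement;
        XmlSchemaGroupBase groupBase = particle as XmlSchemaGroupBase;

        if (el != null)
        {
            // A reference has no Name of its own, so record the referenced name.
            names.Add(el.RefName.IsEmpty ? el.Name : "ref:" + el.RefName.Name);

            // SchemaType holds an anonymous type declared inline, if any.
            XmlSchemaComplexType anonymous = el.SchemaType as XmlSchemaComplexType;
            if (anonymous != null) Walk(anonymous.Particle, names);
        }
        else if (groupBase != null)
        {
            // xs:all, xs:choice, xs:sequence
            foreach (XmlSchemaObject child in groupBase.Items)
            {
                XmlSchemaParticle childParticle = child as XmlSchemaParticle;
                if (childParticle != null) Walk(childParticle, names);
            }
        }
        // null particles and other particle kinds (xs:any, group refs) are skipped
    }
}
```

    Because nothing here depends on compilation, the same method works on a schema that has only been read with XmlSchema.Read, as well as on the schemas returned by ss.Schemas().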

  • W3C and XML Schema

    Paul Downey and C. M. Sperberg-McQueen just published a report on the "XML Schema 1.0 User Experiences" workshop that the W3C organized last month.  Also, the W3C has started a new XML Schema wiki site.

  • XML Editor and Code Snippets

    Introduction

    How many times in your coding life have you had to write or copy-paste code that was almost identical to something that you’d done before? For example, when working on a new xml schema, how many times did you have to write things like

    <xs:complexType name="fooType">
      <xs:sequence>
        <xs:element name=""/>

      </xs:sequence>
    </xs:complexType>

    I know, I know, you can use graphical Schema Editors where you can just point-and-click to generate this code. But can you point-and-click to generate code for a complex type with a sequence of three elements and an xs:any as in the following example?

    <xs:complexType name="fooType">
      <xs:sequence>
        <xs:element name="element1"/>
        <xs:element name="element2"/>
        <xs:element name="element3"/>
        <xs:any namespace="##other" processContents="strict"/>
      </xs:sequence>
    </xs:complexType>

    No? I didn’t think so.

    The XML Editor is going to help you with this. VS2005 introduced a new and very powerful feature called Code Snippets. A code snippet is an XML file (with a .snippet extension) that contains a chunk of code in which you can configure different parameters, reference assemblies, and use references and import statements. Think of them as templates that you can configure to meet your needs. The XML Editor supports this feature and also adds dynamically generated code snippets based on your XML Schemas.

    Standard (Static) Snippets

    Static snippets are available not only in XML Editor, but also in C# and VB. They are created by you or somebody else and stored as .snippet files for future use. When you install XML Editor, you get a number of snippets that we have created for you. For example, a snippet for a new simple type with a pattern restriction:

    <xsd:simpleType name="name" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
      <xsd:restriction base="xsd:string">
        <xsd:pattern value=" "/>
      </xsd:restriction>
    </xsd:simpleType>

    or a C# script block to use in your xslt:

    <ms:script implements-prefix="user" xmlns:ms="urn:schemas-microsoft-com:xslt" language="C#" >
        double my_code() {

        }
    </ms:script>

    To see a list of all available snippets, go to the Tools menu, select Code Snippets Manager, and then select XML from the dropdown. To insert a snippet, right-click and select “Insert snippet…” (also available from Edit | Intellisense | Insert Snippet or Ctrl+K Ctrl+X). If you are using beta2, you might need to hit SPACE to bring up a list of available snippets. Once you’ve inserted a snippet, you can tab between various modifiable parameters of the snippet, which will be highlighted. The snippet will be “committed” when you hit Enter. Also, note that each snippet can have a shortcut, which you can use to insert it. For example, “stpattern” is a shortcut for the “simple type with a pattern restriction” snippet shown above. To use this shortcut, type “<stpattern” (no closing “>”) and hit TAB. Make sure that before you hit TAB, intellisense drop-down window is not shown (you can hit ESCAPE if it is).

    You can also write your own static snippets. All you need to do is create an XML file in accordance with the schema for snippets, which is installed with VS. There are multiple articles that describe how to create your own snippets for VS, for example “Code Snippet – Schema Description” by Sean Laberee. However, what these articles fail to mention is that we provide you with a “Snippet for Snippets”. That’s right, all you need to do is create a new XML file, type “<snippet”, hit TAB and fill in the blanks. Now you can create your own libraries of snippets and even share them with your friends and co-workers.

    Dynamic Snippets

    Dynamic snippets, unlike static snippets, are a unique feature of the XML Editor and are only applicable when working with instance documents that have a schema associated with them. Let’s say you are creating an xml document for a purchase order which contains order ID, customer name, product name, order date, shipping address, and a bunch of other elements. While XML Editor helps you with this by providing intellisense options, it is still a lot of typing. Now in order to provide you with intellisense, the Editor had to parse the schema and at this point knows perfectly well what the structure of the purchase order is. So why not just print it there and let you fill in the values? And that’s exactly what dynamic snippets are. You invoke them in a manner similar to the static snippets – simply type TAB after the element name (for example, “<purchaseOrder”) to get all required attributes and child content pre-populated for you.

    Try them out and let me know what you think.


© 2008 Microsoft Corporation. All rights reserved.