Stan Kitsis

XML Tools and all things schema in system.xml and MSXML

Walking XmlSchema and XmlSchemaSet objects

I’ve seen a number of newsgroup posts asking how to find a particular element, or how to list all elements, in an XmlSchema or XmlSchemaSet object.  The framework doesn’t provide this functionality, so you need to traverse these objects yourself.  Depending on your goal, you might need either pre-compile or post-compile information: for example, named groups are not available post-compile, while PSVI information is not available pre-compile.  In this post I’ll show you how to get pre-compile information from these objects.  To keep the code easy to read, I’ve left out error and exception handling that isn’t relevant to the traversal.

 

I’m assuming that you have an XmlSchemaSet (called ss in the code below) with a few schemas added to it.  XmlSchemaSet provides collections of global elements, types, and attributes.  However, these collections are empty until you compile the set.  Note that the set has no collection of named groups.  To get the pre-compile info (including a list of named groups), you need to go through each schema in the set.
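Here is a minimal sketch of that difference. The sample schema text and the names SchemaSetDemo, Load, NamedGroups, and GlobalElementCounts are made up for illustration and are not part of the framework. It lists named groups by scanning each schema's Items collection, and reports how many set-level global elements are visible before and after Compile().

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class SchemaSetDemo
{
    // Hypothetical sample schema: one global element and one named group.
    public const string Xsd = @"<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>
  <xs:element name='root' type='xs:string'/>
  <xs:group name='g'>
    <xs:sequence>
      <xs:element name='child' type='xs:int'/>
    </xs:sequence>
  </xs:group>
</xs:schema>";

    // Builds an uncompiled XmlSchemaSet from schema text.
    public static XmlSchemaSet Load(string xsd)
    {
        XmlSchemaSet ss = new XmlSchemaSet();
        using (XmlReader reader = XmlReader.Create(new StringReader(xsd)))
        {
            // null means: use the targetNamespace declared in the schema itself
            ss.Add(null, reader);
        }
        return ss;
    }

    // Collects named group names from the pre-compile Items collection of each schema.
    public static List<string> NamedGroups(XmlSchemaSet ss)
    {
        List<string> names = new List<string>();
        foreach (XmlSchema schema in ss.Schemas())
        {
            foreach (XmlSchemaObject item in schema.Items)
            {
                XmlSchemaGroup group = item as XmlSchemaGroup;
                if (group != null) names.Add(group.Name);
            }
        }
        return names;
    }

    // Returns { count before Compile(), count after Compile() } of the set's global elements.
    public static int[] GlobalElementCounts(XmlSchemaSet ss)
    {
        int before = ss.GlobalElements.Count;
        ss.Compile();
        int after = ss.GlobalElements.Count;
        return new int[] { before, after };
    }
}
```

Calling SchemaSetDemo.Load(SchemaSetDemo.Xsd) and passing the result to NamedGroups and then GlobalElementCounts shows both behaviours on the same set.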

 

foreach (XmlSchema schema in ss.Schemas())
{
}

 

Once you have an XmlSchema object, you can step through each of its globals:

 

// stepping through global complex types
foreach (XmlSchemaType type in schema.SchemaTypes.Values)
{
    if (type is XmlSchemaComplexType)
    {
    }
}

 

// stepping through global elements
foreach (XmlSchemaElement el in schema.Elements.Values)
{
}

 

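// stepping through named groups
// (Items can also contain a top-level xs:annotation, which is an
// XmlSchemaObject but not an XmlSchemaAnnotated, so iterate as XmlSchemaObject)
foreach (XmlSchemaObject item in schema.Items)
{
    if (item is XmlSchemaGroup)
    {
    }
}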

 

Now that we have a global, whether it’s a type, an element, or a group, how do we traverse it?  I’m going to use a recursive method that takes an XmlSchemaParticle to do it.

 

void walkTheParticle(XmlSchemaParticle particle)
{
    if (particle is XmlSchemaElement)
    {
        XmlSchemaElement elem = particle as XmlSchemaElement;

        // todo: insert your processing code here

        // only locally declared elements; references to global elements are skipped
        if (elem.RefName.IsEmpty)
        {
            XmlSchemaType type = elem.ElementSchemaType;
            if (type is XmlSchemaComplexType)
            {
                XmlSchemaComplexType ct = type as XmlSchemaComplexType;
                // only anonymous (locally defined) types; global types are walked separately
                if (ct.QualifiedName.IsEmpty)
                {
                    walkTheParticle(ct.ContentTypeParticle);
                }
            }
        }
    }
    else if (particle is XmlSchemaGroupBase)
    {
        // xs:all, xs:choice, xs:sequence
        XmlSchemaGroupBase baseParticle = particle as XmlSchemaGroupBase;
        foreach (XmlSchemaParticle subParticle in baseParticle.Items)
        {
            walkTheParticle(subParticle);
        }
    }
}

 

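If the particle passed to walkTheParticle is a base group (all, choice, or sequence), we loop through each particle within that group.  If the particle is an element, we do our processing and then, if the element is declared locally with an anonymous complex type, walk that type’s content.  Note that ElementSchemaType and ContentTypeParticle are post-compilation properties: on a set that has not been compiled (ss.Compile()), ElementSchemaType is null and the walk does not descend into an element’s children.  Finally, below are the last touches to the calling method to make it all work.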

 

void start(XmlSchemaSet ss)
{
    foreach (XmlSchema schema in ss.Schemas())
    {
        // global complex types
        foreach (XmlSchemaType type in schema.SchemaTypes.Values)
        {
            if (type is XmlSchemaComplexType)
            {
                XmlSchemaComplexType ct = type as XmlSchemaComplexType;
                walkTheParticle(ct.ContentTypeParticle);
            }
        }

        // global elements
        foreach (XmlSchemaElement el in schema.Elements.Values)
        {
            walkTheParticle(el);
        }

        // named groups; iterate as XmlSchemaObject because Items can also
        // contain a top-level xs:annotation, which is not an XmlSchemaAnnotated
        foreach (XmlSchemaObject item in schema.Items)
        {
            if (item is XmlSchemaGroup)
            {
                XmlSchemaGroup xsg = item as XmlSchemaGroup;
                walkTheParticle(xsg.Particle);
            }
        }
    }
}
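As a rough end-to-end illustration of the walk above, the sketch below loads a schema from text, compiles the set, and collects the local name of every element the walk visits from the global elements. The class and method names (SchemaWalker, CollectElementNames, Walk) are hypothetical, and recording names stands in for "your processing code"; it is one possible arrangement under those assumptions, not the only way to structure it.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class SchemaWalker
{
    // Loads and compiles the schema text, then walks every global element.
    public static List<string> CollectElementNames(string xsd)
    {
        XmlSchemaSet ss = new XmlSchemaSet();
        using (XmlReader reader = XmlReader.Create(new StringReader(xsd)))
        {
            ss.Add(null, reader);
        }
        // ElementSchemaType and ContentTypeParticle are filled in by compilation.
        ss.Compile();

        List<string> names = new List<string>();
        foreach (XmlSchema schema in ss.Schemas())
        {
            foreach (XmlSchemaElement el in schema.Elements.Values)
            {
                Walk(el, names);
            }
        }
        return names;
    }

    // Same shape as walkTheParticle, with "processing" reduced to recording the name.
    static void Walk(XmlSchemaParticle particle, List<string> names)
    {
        XmlSchemaElement elem = particle as XmlSchemaElement;
        XmlSchemaGroupBase group = particle as XmlSchemaGroupBase;
        if (elem != null)
        {
            names.Add(elem.QualifiedName.Name);
            XmlSchemaComplexType ct = elem.ElementSchemaType as XmlSchemaComplexType;
            // Descend only into locally declared elements with anonymous complex types.
            if (elem.RefName.IsEmpty && ct != null && ct.QualifiedName.IsEmpty)
            {
                Walk(ct.ContentTypeParticle, names);
            }
        }
        else if (group != null)
        {
            // xs:all, xs:choice, xs:sequence
            foreach (XmlSchemaParticle sub in group.Items)
            {
                Walk(sub, names);
            }
        }
    }
}
```

Passing the text of a schema to SchemaWalker.CollectElementNames returns the visited element names as a list; element references are recorded by name but not followed, matching the RefName check in walkTheParticle.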

 

That’s about it.  Let me know if you have any questions.

Published Saturday, August 06, 2005 2:25 PM by skits

Comments

 

 

Moshe Gershberg said:

Looks cool. What about elements in an XSD file that have minOccurs = 0 and maxOccurs = 0?
I have a real situation where I need to iterate through the elements of my schema and recognize those that have maxOccurs = 0.

My email is Moshe_Gershberg@Coutrywide.com
June 5, 2006 10:25 AM
 

Roy said:

Stan,

If I have an XSD defining

  Simple Type (like an integer)

  Complex Type

  Simple Type (like an integer)

  Complex Type

and then I do a GetXml() on my DataSet, the resulting XML comes out with all the Simple Types first and then all the Complex Types.

i.e.

  Simple Type (like an integer)

  Simple Type (like an integer)

  Complex Type

  Complex Type

Is there a way I can have GetXml() keep things in the same order as the XSD?

Thanks.

Roy

April 6, 2007 5:33 AM
 

 

robert said:

OK, how do you determine which schema each element belongs to?

July 23, 2007 10:34 AM
 

dinhny said:

Hi, I don't understand why you check elem.RefName.IsEmpty and ct.QualifiedName.IsEmpty in your example code:

if (elem.RefName.IsEmpty)
{
    XmlSchemaType type = (XmlSchemaType)elem.ElementSchemaType;
    if (type is XmlSchemaComplexType)
    {
        XmlSchemaComplexType ct = type as XmlSchemaComplexType;
        if (ct.QualifiedName.IsEmpty)
        {
            walkTheParticle(ct.ContentTypeParticle);
        }
    }
}

September 6, 2007 6:14 PM
 

Steve Marshall said:

I also don't understand why the code checks for elem.RefName.IsEmpty.  What should it do if the RefName is NOT empty?

I've been using some code modelled on yours to search a single schema quite successfully.  But my app now uses a very complex set of schemas, and the code no longer finds things.  My feeling is that it is because there are a lot of elements with RefNames, and the code is not handling it.  But my knowledge of XmlSchema internals is not good enough to work out what needs to be changed.  Any suggestions?  Basically I want to hand a function a string like an Xpath, and get back a schema element for the leaf node.

March 5, 2008 11:34 PM
 

Stan said:

Steve,

The reason to check RefName is to distinguish between locally defined elements (RefName is empty) and references to global elements (RefName is NOT empty).  In some cases you might want to skip references, in others not.  It sounds like in your case you don't want to differentiate between the two.

March 7, 2008 8:54 PM
 

Stan said:

Checking for ct.QualifiedName.IsEmpty is similar to checking for elem.RefName.IsEmpty - the main goal here is to distinguish between global types and locally defined types.

March 7, 2008 9:01 PM
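As a rough illustration of the distinction described in the two replies above, the sketch below labels each direct child of a global element as a local declaration or a reference, and looks up the global declaration behind a reference through the compiled set's GlobalElements table. The names RefDemo, Compile, Resolve, and DescribeChildren are hypothetical helpers, and the sketch assumes a schema without a target namespace and without substitution groups.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Xml;
using System.Xml.Schema;

public static class RefDemo
{
    // Loads schema text into a set and compiles it (hypothetical helper).
    public static XmlSchemaSet Compile(string xsd)
    {
        XmlSchemaSet ss = new XmlSchemaSet();
        using (XmlReader reader = XmlReader.Create(new StringReader(xsd)))
        {
            ss.Add(null, reader);
        }
        ss.Compile();
        return ss;
    }

    // Returns the global declaration behind a reference,
    // or the element itself when it is declared locally.
    public static XmlSchemaElement Resolve(XmlSchemaSet ss, XmlSchemaElement elem)
    {
        if (elem.RefName.IsEmpty) return elem;
        return (XmlSchemaElement)ss.GlobalElements[elem.RefName];
    }

    // Labels each direct child of the named global element as "local" or "ref".
    public static List<string> DescribeChildren(XmlSchemaSet ss, string globalElementName)
    {
        List<string> result = new List<string>();
        XmlSchemaElement parent =
            (XmlSchemaElement)ss.GlobalElements[new XmlQualifiedName(globalElementName)];
        XmlSchemaComplexType ct = parent.ElementSchemaType as XmlSchemaComplexType;
        XmlSchemaGroupBase group = ct == null ? null : ct.ContentTypeParticle as XmlSchemaGroupBase;
        if (group == null) return result;

        foreach (XmlSchemaObject item in group.Items)
        {
            XmlSchemaElement child = item as XmlSchemaElement;
            if (child == null) continue;
            string kind = child.RefName.IsEmpty ? "local" : "ref";
            // A reference carries no Name of its own, so read it from the resolved declaration.
            result.Add(Resolve(ss, child).Name + " (" + kind + ")");
        }
        return result;
    }
}
```

To follow references during a walk instead of skipping them, the element returned by Resolve could be passed on to the recursive method, with some guard against revisiting the same global element in recursive schemas.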
 

Josh said:

I've got a schema snippet that looks like this:

<xs:element name="TimeOfDay">
  <xs:complexType>
    <xs:all>
      <xs:element minOccurs="1" maxOccurs="1" name="Time" type="ValidTime" />
      <xs:element minOccurs="0" maxOccurs="1" name="Tolerance" type="ValidTolerance" />
    </xs:all>
  </xs:complexType>
</xs:element>

When it goes through the code, elem.ElementSchemaType is null, so it doesn't get any particles off of it. I'm not sure why it's null though - it looks like a complexType to me. Is my schema wrong? It validates stuff properly...

March 10, 2008 9:28 PM
 

skits said:

What are ValidTime and ValidTolerance types?  They have to be either simple types or complex types with simple content for the schema to be valid.  Also, what element are you on when elem.ElementSchemaType is null?

March 11, 2008 6:59 PM
 

Josh said:

ValidTime is a simple type:

<xs:simpleType name="ValidTime">
  <xs:restriction base="xs:int">
    <xs:minInclusive value="0" />
    <xs:maxInclusive value="2359" />
  </xs:restriction>
</xs:simpleType>

ValidTolerance is similar. I expected them to flip through quickly, which they do - it was on the TimeOfDay element when ElementSchemaType is null.

March 11, 2008 9:53 PM
 

Jennifer said:

Hi Stan. Very helpful - thanks!

What I'm trying to do is open an XSD file and list all its elements and their attributes. So far I've succeeded in listing all the elements, but I can't seem to find a way to list all the attributes of a particular element. Any help please? Below is part of the XSD file I'm working with.

<xs:element name="ARTICLE">
  <xs:complexType>
    <xs:sequence>
      <xs:element ref="HEADLINE"/>
      <xs:element ref="BYLINE"/>
      <xs:element ref="LEAD"/>
      <xs:element ref="BODY"/>
      <xs:element ref="NOTES"/>
    </xs:sequence>
    <xs:attribute name="AUTHOR" type="xs:anySimpleType" use="required"/>
    <xs:attribute name="EDITOR" type="xs:anySimpleType"/>
    <xs:attribute name="DATE" type="xs:anySimpleType"/>
    <xs:attribute name="EDITION" type="xs:anySimpleType"/>
  </xs:complexType>
</xs:element>

April 14, 2008 10:10 AM
 

Jonas Scalar said:

This code is great.

Is there a way to get the parent name, and the type definition of the element?

January 13, 2009 7:27 PM
 
