Hi everyone.  My name is Steve Ernst and I am a Software Design Engineer on the Microsoft Business Framework team.  I’m pretty new to this whole blog thing so please have some patience with me as I get the hang of it.

 

My two primary areas of interest are making music with my band mates and all things concerning software development.  Although I am very passionate about both, this blog will focus on the latter.

 

To start things off, I’ve been thinking a lot lately about some of the edges of serialization.  Most serialization scenarios are pretty straightforward and consequently not very interesting.  The really juicy parts are where things start to lose their determinism.

 

Picture the following scenario.  (I know, I know, who codes in C++ anymore?  Not me anymore at least ;-) )  I’ve chosen to use C++’s union keyword to lead off this topic because it is easy to understand and fits perfectly.

 

union foo {
     int x;
     std::string bar;   // non-trivial member: legal since C++11, but foo must manage bar's lifetime itself
};

 

The type foo can hold one of two kinds of value, an integer or a string.  If you remember the definition of union, only one of the two members can hold a value at any given time.  So, imagine having a function g that returns an XSD schema for any possible type.  What would g return if passed foo?

 

XSD xsdResult = g(typeof(foo));

 

Obviously, we need an XSD that indicates that foo could contain either an integer or string but not both.  Maybe it would look something like this.

 

<xsd:simpleType name="foo">
     <xsd:union memberTypes="xsd:int xsd:string" />
</xsd:simpleType>
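
To make that a little more concrete, here is a rough C++17 sketch of what one tiny slice of g might do for this single case.  It rests on a few assumptions of mine: std::variant stands in for the raw union so that the set of member types is visible to the compiler, and XsdName and UnionSchema are hypothetical helpers invented for this example, not part of any real library.

```cpp
#include <cassert>
#include <string>
#include <variant>

// Hypothetical mapping from a C++ type to an XSD built-in type name.
template <typename T> struct XsdName;
template <> struct XsdName<int> { static constexpr const char* value = "xsd:int"; };
template <> struct XsdName<std::string> { static constexpr const char* value = "xsd:string"; };

// Hypothetical helper: only specialized for std::variant, i.e. for "union-like" types.
template <typename Variant> struct UnionSchema;

template <typename... Ts>
struct UnionSchema<std::variant<Ts...>> {
    static std::string generate(const std::string& type_name) {
        assert(!type_name.empty());
        std::string members;
        // Visit every alternative in declaration order and space-separate the names.
        ((members += (members.empty() ? "" : " "), members += XsdName<Ts>::value), ...);
        return "<xsd:simpleType name=\"" + type_name + "\">\n"
               "     <xsd:union memberTypes=\"" + members + "\" />\n"
               "</xsd:simpleType>";
    }
};

// Assumption: a tagged stand-in for the union foo from earlier in the post.
using foo = std::variant<int, std::string>;

// A very small stand-in for g that only understands variants.
template <typename T>
std::string g(const std::string& type_name) {
    return UnionSchema<T>::generate(type_name);
}
```

Calling g<foo>("foo") should produce a three-line schema like the one above.  A real g would of course have to cope with far more than variants of two built-in types.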

 

The resulting XML might then look like one of the following:

 

<foo>5</foo>

 

or

 

<foo>whoa</foo>
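
This is also a first taste of where determinism starts to slip.  Nothing in <foo>5</foo> says whether 5 was an integer or the string "5".  XSD settles the question by trying the member types in the order they are listed, so a reader has to do the same.  Below is a minimal C++17 sketch of that idea.  The names serialize and deserialize_text are hypothetical, std::variant again stands in for the union, and the integer parsing only approximates xsd:int (no leading plus sign, no whitespace handling, no XML escaping).

```cpp
#include <cassert>
#include <charconv>
#include <string>
#include <system_error>
#include <variant>

// Assumption: a tagged stand-in for the union foo from earlier in the post.
using foo = std::variant<int, std::string>;

// Write whichever member is active as the text content of a <foo> element.
std::string serialize(const foo& value) {
    assert(!value.valueless_by_exception());
    std::string text = std::holds_alternative<int>(value)
        ? std::to_string(std::get<int>(value))
        : std::get<std::string>(value);
    return "<foo>" + text + "</foo>";
}

// Read element text by trying the member types in schema order: int first, then string.
foo deserialize_text(const std::string& text) {
    int number = 0;
    const char* first = text.data();
    const char* last = first + text.size();
    auto [ptr, ec] = std::from_chars(first, last, number);
    // Accept the int interpretation only if the whole text was consumed.
    if (ec == std::errc() && ptr == last) {
        return foo{number};
    }
    return foo{text};
}
```

Notice that a foo holding the string "5" serializes to the same XML as a foo holding the integer 5, and comes back as an integer.  Whether that matters depends entirely on what the consumer of the data expects.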

 

So, when our function g generates the schema for a type, we know there are going to be cases where we will need to use the xsd:union element in the schema.  The actual C++ union work we did before probably won’t be one of those cases, but I can think of another.  Take a look at the following code:

 

public class Parent1 {
     string aMember;
     Child child;
}

public class Parent2 {
     string bMember;
     Child child;
}

public class Child {
     object parent;
}

 

It’s very easy to see that the schemas generated by g(typeof(Parent1)) and g(typeof(Parent2)) are pretty simple as everything is strongly typed.  This is not the case for g(typeof(Child)).  The loosely typed object reference to the child’s parent is going to be a problem.  We don’t know what the real type of the parent is until runtime and we have a real instance of a Parent and Child pair.  Sounds like another perfect opportunity for the xsd:union clause.

 

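One catch: xsd:union can only combine simple types, and Parent1 and Parent2 have child elements, which makes them complex types.  So the sketch below declares everything as xsd:complexType and expresses the same either/or idea with xsd:choice.

<xsd:complexType name="Parent1">
     <xsd:sequence>
          <xsd:element name="aMember" type="xsd:string" />
          <xsd:element name="child" type="Child" />
     </xsd:sequence>
</xsd:complexType>

<xsd:complexType name="Parent2">
     <xsd:sequence>
          <xsd:element name="bMember" type="xsd:string" />
          <xsd:element name="child" type="Child" />
     </xsd:sequence>
</xsd:complexType>

<xsd:complexType name="ParentRef">
     <xsd:choice>
          <xsd:element name="Parent1" type="Parent1" />
          <xsd:element name="Parent2" type="Parent2" />
     </xsd:choice>
</xsd:complexType>

<xsd:complexType name="Child">
     <xsd:sequence>
          <xsd:element name="parent" type="ParentRef" />
     </xsd:sequence>
</xsd:complexType>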

 

Ahh, short and sweet.  We now have a way of clarifying the schema.  This can of course get pretty hairy depending upon the complexity of a child object’s ancestry but it is certainly tractable.
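
For the curious, here is roughly how the "we don't know until runtime" part might look if the classes above were modeled in C++17.  This is only a sketch under my own assumptions: std::any plays the role of the loosely typed object reference, the parent is stored as a plain non-const pointer, and parent_type_name is a hypothetical helper that checks the candidate parent types one by one.

```cpp
#include <any>
#include <cassert>
#include <string>

struct Child;

struct Parent1 { std::string aMember; Child* child = nullptr; };
struct Parent2 { std::string bMember; Child* child = nullptr; };

// std::any stands in for the loosely typed "object parent" field.
struct Child { std::any parent; };

// Returns the schema type name of the parent actually attached at runtime,
// or an empty string if it is not one of the known candidate types.
std::string parent_type_name(const Child& child) {
    // any_cast on a pointer yields nullptr unless the stored type matches exactly.
    if (std::any_cast<Parent1*>(&child.parent)) return "Parent1";
    if (std::any_cast<Parent2*>(&child.parent)) return "Parent2";
    return "";
}

// Usage sketch: the same Child type reports a different parent type per instance.
void example_usage() {
    Parent1 p1;
    Child c;
    c.parent = &p1;   // stored as Parent1*
    p1.child = &c;
    assert(parent_type_name(c) == "Parent1");
}
```

The list of candidates is hard-coded here, which is exactly the part that gets hairy as the ancestry grows.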

 

I’ll admit that most of this stuff has been pretty elementary thus far.  For the next installment I’ll add in some complication and we’ll start to flesh out exactly what g would have to look like to solve the “schema for any type” problem.