Typed XML programmer -- What’s your scale of typing?


“Typed XML programmer, where have you been in the last few weeks?” This series of blog posts has been stuck because I attended a bunch of conferences and workshops, and I was really busy with an incubation project on typed XML programming (see later in this series). I admit that I also worked on untyped XML programming -- the streaming part of it -- cool stuff … I will be writing about it when I get a chance.

 

 

Executive summary

 

XML programmer -- here is your scale of typing:

 

  0. Text.
  1. Syntactically correct XML data.
  2. Well-formed XML data.
  3. Validated XML data.
  4. Intellisensially typed XML literals and XML queries.
  5. Static types for XML data in objects with type erasure.
  6. Static types for XML data in objects without type erasure.
  7. Types that obey XML semantics and schema constraints.

 

(Let me recall that, in the present series, I focus on XML programming when embedded in an OO programming language.) I really don’t think of the 0-7 scale as a strict progression of typing for reasons that will become evident soon. Also, I contend that typing discussions all too easily focus on type checking (and perhaps IntelliSense), whereas I will make sure to recall other important roles of types.

 

Let’s quickly walk through the 0-7 scale of typing before I detail the more interesting levels in designated sections of this post. An untyped XML program typically operates at levels 2-3, but there are applications (e.g., an XML editor) that exercise the lower levels 0-1 because they must smoothly handle XML data that is (at least transiently) ill-formed or syntactically incorrect. Level 3, validated XML data, means that some XML data (such as a DOM tree) is valid with regard to a given schema. I prefer saying validated XML data over saying valid XML data since I wish to point out that this level stands for checking validity explicitly, such as by invoking a validation method on a DOM tree.

 

Levels 0-3 basically concern properties of XML data without yet involving the type system of a programming language. Typing (including type checking) only starts at the levels above. Level 4, intellisensially typed XML literals and XML queries, aims to add design-time support for XML literals (say, ‘XML data in code’) and some axes for XML queries (such as the child axis). Levels 5 and 6 leverage type-system capabilities to model XML data as objects of types that are derived from a schema, so that ‘normal’ static type checking of the OO language at hand applies. (When the object systems of mainstream languages are used, we naturally operate at the non-erasing level 6, that is, types remain at run-time, but a more lightweight level 5 is conceivable, which I mention here just for completeness -- also because a retrofitted IntelliSense-biased approach may count as a type-erasing option.)

 

I reserve the highest level on the scale for a typed XML programming setting that facilitates a combination of static and dynamic typing, as well as the appropriate semantics of the types, so that the resulting objects obey XML semantics and schema constraints. I propose the term ‘XML objects’ to this end. In my humble opinion, a state-of-the-art approach for typed XML programming should strive for level 7.

 

 

From plain objects to XML objects

 

XML objects care about …

 

  • XML semantics (tree status, comments, PIs, ...);
  • static typing for queries (such as the child axis);
  • structural integrity with regard to complex content;
  • value integrity with regard to simple content;
  • the nominal abstractions in schemas;
  • default values and fixed values;
  • XSD documentation elements.

 

When XML content is constructed or queried through XML objects, then the ‘XML-ishness’ of the data is handled in a non-sloppy way and many programming errors (that could lead to invalid data) are ruled out or discouraged. In reality, XML objects may be imperfect, so as to account for performance and simplicity, or as a result of language limitations. Hence, real-world XML objects may compromise on some of the bullets above.

 

Let’s illustrate XML objects by an example that deals with XSD’s default values. The following XSD complex type defines a content model for addresses. Note the attribute declaration with a default near the end: the country of an address element, when not present, is assumed to be “USA” by default:

 

  <xs:element name="Address">

   <xs:complexType>

    <xs:sequence>

     <xs:choice>

      <xs:element name="Street" type="xs:string"/>

      <xs:element name="POBox" type="xs:int"/>

     </xs:choice>

     <xs:element name="City" type="xs:string"/>

     <xs:element name="Zip" type="xs:int"/>

     <xs:element name="State" type="xs:string"/>

    </xs:sequence>

    <xs:attribute name="Country" type="xs:string" default="USA"/>

   </xs:complexType>

  </xs:element>

 

Assuming an appropriate X-to-O mapping, we construct an address object as follows:

 

// Let’s use C# 3.0 object initializer syntax

Address adr = new Address {

Street = "123 Main St",

City   = "Mercer Island",

Zip    = 68042,

State  = "WA" };

 

What should be printed by the following statement?

 

Console.WriteLine(adr.Country);

 

Of course, we want to see the declared default: “USA”.

However, when writing out the XML, no country should be set:

 

<Address xmlns="http://www.example.com/Address">

  <Street>123 Main St</Street>

  <City>Mercer Island</City>

  <Zip>68042</Zip>

  <State>WA</State>

</Address>

 

The example substantiates that XML objects need to go beyond the naïve formula ‘map complex types to classes, map element particles to fields’. I contend that object types for XML processing are best thought of as abstract data types (ADTs). This status is almost a prerequisite to the faithful implementation of XML semantics, integrity checking, and defaults. Just for fun I encourage you to try out the above example with your favorite typed XML programming approach.
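
To make the ADT reading concrete, here is a minimal sketch, not the output of any real code generator. It assumes a hand-written Address class that wraps a LINQ to XML tree; the Untyped property and the whole class layout are hypothetical names chosen for illustration. The Country getter reports the schema default without ever writing it into the tree. The sketch uses current C# syntax (top-level statements, expression-bodied accessors).

```csharp
using System;
using System.Xml.Linq;

// Usage: the getter reports the schema default; the tree itself stays unchanged.
var adr = new Address {
    Street = "123 Main St",
    City   = "Mercer Island",
    Zip    = 68042,
    State  = "WA" };
Console.WriteLine(adr.Country);   // prints "USA"
Console.WriteLine(adr.Untyped);   // serialized tree carries no Country attribute

// Hypothetical, hand-written stand-in for a schema-derived class.
public class Address
{
    static readonly XNamespace ns = "http://www.example.com/Address";

    // The XML tree is the only state; properties are views on it.
    readonly XElement tree = new XElement(ns + "Address");

    // Assumed accessor for the underlying untyped tree.
    public XElement Untyped => tree;

    public string Street
    {
        get => (string)tree.Element(ns + "Street");
        set => tree.SetElementValue(ns + "Street", value);
    }

    public string City
    {
        get => (string)tree.Element(ns + "City");
        set => tree.SetElementValue(ns + "City", value);
    }

    public int Zip
    {
        get => (int)tree.Element(ns + "Zip");
        set => tree.SetElementValue(ns + "Zip", value);
    }

    public string State
    {
        get => (string)tree.Element(ns + "State");
        set => tree.SetElementValue(ns + "State", value);
    }

    // default="USA" in the schema: report it when the attribute is absent,
    // but store an attribute only when the programmer sets one explicitly.
    public string Country
    {
        get => (string)tree.Attribute("Country") ?? "USA";
        set => tree.SetAttributeValue("Country", value);
    }
}
```

A caveat on the sketch: SetElementValue appends new children at the end, so the element order in the tree follows the order of assignments. A faithful implementation would have to insert each child at the position that the content model prescribes.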

 

 

Typing where possible

 

One of the myths of X-to-O mapping is the mere expectation that all constraints from any XML schema should ideally be handled by static type checking of the OO language at hand. This expectation is not realistic for mainstream languages. Let’s not get lured into the despair of the X/O impedance mismatch, and let’s try out a more flexible and expressive setup based on the formula ‘static typing where possible, dynamic typing when needed’.

 

Here are two archetypal examples for the dichotomy ‘static vs. dynamic’. It is straightforward to enforce type correctness for the child axis statically insofar as only declared element names are accepted and the corresponding OO properties are of the (mapped) element types. It is impractical to statically enforce schema constraints such as:

 

  • maxOccurs="42"
  • <xs:maxInclusive value="88"/>

 

These constraints do not come with an obvious or useful counterpart in the typical type system. The constraints may still be checked by run-time assertions in the setters and/or getters for the relevant subtrees.
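
As a rough sketch of such run-time assertions, consider a hypothetical class Measurement. Imagine it derived from a schema that puts the facet maxInclusive="88" on a Level element and maxOccurs="42" on a repeating Sample element; the class, its members, and the choice of exception types are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Usage: the violation is caught at run time, not at compile time.
var m = new Measurement();
m.Level = 88;                       // accepted: within the facet
try { m.Level = 89; }               // rejected by the setter
catch (ArgumentOutOfRangeException) { Console.WriteLine("Level rejected"); }

// Hypothetical schema-derived class.
public class Measurement
{
    const int LevelMaxInclusive = 88;   // from <xs:maxInclusive value="88"/>
    const int SampleMaxOccurs   = 42;   // from maxOccurs="42"

    int level;
    readonly List<int> samples = new List<int>();

    public int Level
    {
        get => level;
        set
        {
            // Value integrity: check the facet before accepting the value.
            if (value > LevelMaxInclusive)
                throw new ArgumentOutOfRangeException(
                    nameof(value), "Level violates maxInclusive=88");
            level = value;
        }
    }

    // Read-only view, so that all additions pass through AddSample.
    public IReadOnlyList<int> Samples => samples;

    public void AddSample(int sample)
    {
        // Structural integrity: refuse the 43rd occurrence.
        if (samples.Count >= SampleMaxOccurs)
            throw new InvalidOperationException("Sample violates maxOccurs=42");
        samples.Add(sample);
    }
}
```

Exposing the samples as a read-only list is a deliberate choice in this sketch: handing out the mutable List would let callers bypass the occurrence check.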

 

So far, this all sounds great, but there is no guarantee that it is easy to decide whether to use static or dynamic typing. For instance, when and how exactly should key constraints be checked? Of course, explicit validation is always an option, but what if we wanted to check the key constraints more ‘automatically’, that is, through the type system? Let’s be modest and think of dynamic typing first. Checking key constraints instantly (upon each modification of an XML tree) may be impractical due to the need for some sort of atomicity for related modifications. Perhaps statically typed idioms could be devised such that they address the atomicity requirement? I need to think about that.

 

Here is a bummer. There is one overall topic in typed XML programming with XML objects that calls for static typing, but the typical OO type system is too weak. Worse than that, dynamic typing is no good replacement. I am talking about object construction. We might be facing a nonmythical substantiation of the X/O impedance mismatch.

 

Look at this type-checked code for constructing an address:

 

Address oops = new Address {

Street = "123 Main St",

POBox  = 456789 };

 

Not only do we succeed in nonsensically defining both a street and a POBox, the static typing at hand also fails to tell us about omitted assignments to City, Zip and State. I have tried out this scenario with various folks by now. Occasionally, I encounter protest: “No problem -- just define appropriate non-default constructors, and perhaps even disable the default constructor!” So we would get constructors as follows:

 

  Address(string street, string city, int zip, string state) {…}

  Address(int pobox, string city, int zip, string state) {…}

 // ... and two more for the case that country is present

 

I only say two words: explosion (think of just slightly bigger content models perhaps with a few more optional particles) and anonymity (think of each constructor invocation listing just plain string or integer literals without any clarifying selector names). These two issues alone are show stoppers.

 

With regard to dynamic typing, it turns out that there is no obvious point along program execution when the actual content could be checked against the schema-defined content model. I have played with a technique that I call ‘shallow validation upon parenting’, but I am not going to sell this here as a proper solution. The fact that OO languages like Java, C# and even Comega let us down is disappointing.

 

As a result, ‘static typing where possible, dynamic typing when needed’ does not quite appeal in this case. It’s more like ‘static typing where possible, else dynamic typing where possible; no typing otherwise’. Fortunately, (I contend that) XML objects are not challenged by any other problem of similar severity. Hence, I propose to work on a practical solution of the construction problem so as to make the life of typed XML programmers more … well typed. A good part of this problem has been researched in the XDuce and XTatic projects.

 

 

Type checking vs. validation

 

Let’s go through some progression of questions and opinions:

 

  • So XML objects add type checking for XML processing code. Great!
  • Do I still need to validate my XML data when using XML objects?
  • More specifically, what about validating input data? … output data?
  • In different terms, how much validation do I get from type checking?
  • I am afraid that I end up with invalid content despite typing.
  • I am also afraid that validation may be performed multiple times.

 

Let’s first get the key terms right, without getting too formal:

 

  • Type checking imposes a discipline on programming such that ‘nothing can go wrong’: when an ‘operation’ is applied to some data (say, a method is applied to its arguments and invoked on an object), static and dynamic type checking have established the operation’s applicability ahead of time so as to rule out corrupt data and abnormal program states.

 

  • By contrast, validation is an operation (‘a predicate’) that the programmer applies to in-memory or serialized data. In reality, we expect to get more feedback than true or false, and to be able to provide handlers to recover from certain validation errors. By the way, validation is not XML-specific; consider Cobol’s designated language concepts for data validation.

 

Here is the question: How are type checking and validation aligned in typed XML programming? You could take the position that the validity of XML trees should be a basic invariant to be enforced by the type system. Hence, any query, any construction and any mutation would need to guarantee that enough validation is performed before exposing the result to the program. Such ‘validity at all times’ is theoretically possible, but computationally expensive and difficult to reconcile with the expressiveness and the idiomatic use of mainstream programming languages as well as other requirements for XML programming.

 

Here is a simple example:

 

var adr = new Address();

adr.Street = "123 Main St";

adr.City   = "Mercer Island";

adr.Zip    = 68042;

adr.State  = "WA";

 

The point is that, as long as we allow ourselves to use default constructors, ‘validity at all times’ is infeasible unless we assume that some default values are generated for all subtrees. Of course, we could think of a clever program analysis in the above example that establishes the fact that the complete statement sequence delivers a complete address without exposing the incomplete address object. So perhaps the example at hand isn’t strong -- also because the ‘imperative construction idiom’ used above could soon belong to the past era of ‘dysfunctional programming’.

 

However, there is a stickier problem. The formula ‘validity at all times’ would imply that any XML data source needs to be fully validated before we are allowed to apply any operation to it. Not only does this rule out XML streaming quite brutally, it also implies considerable ‘start-up’ costs for the access to an XML data source. Suppose you have an untyped XML tree sitting somewhere and you want to cast it to a typed XML tree, say for an address; what do you expect to happen?

 

// Creating a typed view with LINQ to XML and friends

var xe = XElement.Load("Address.xml");

var adr = (Address)xe;

 

Are you willing to pay the penalty of a full validation pass? In addition to the issues of performance and streaming capability, what if you want to account for robustness of your XML processing code in the sense that it does not throw as long as the input is ‘valid enough’ to provide the information needed by your query? All these considerations lead me to favor what might be called demand-driven validation.

 

With such demand-driven validation, the above cast from XElement to Address will solely perform a shallow test on the root element tag of the untyped XML tree -- just enough to establish that we are facing an address. The cast operation, by itself, does not take any dependency on the precise content model of the address; so we defer all remaining data inspection until the relevant data is demanded. Let’s descend into the typed XML tree:

 

// Print the zip, which is an integer

Console.WriteLine(adr.Zip);

 

Now what sorts of invalidity could we run into?

 

  1. There is no Zip element, even though it is mandatory per schema.
  2. There is a Zip element, but its data type is not quite xs:int.
  3. There is a valid Zip element, but as part of an invalid content:
    a. An order constraint is violated.
    b. There is more than one Zip element.
    c. … reasons related to other elements …

 

Re: 1. -- if there is no Zip element, I recommend throwing because my code takes a dependency on the zip code being there, and it is risky to let it go with any sort of undeclared default such as 0.

 

Re: 2. -- if the data type is incorrect for the Zip element, again I recommend throwing, for the same reason as above. I can imagine handling ‘demand-driven validation errors’ by means of handlers that the programmer associates with typed XML trees.

 

Re: 3.a, b, c -- I recommend adopting the demand-driven regime again as opposed to throwing. All these cases can be handled by a ‘pick first’ semantics, except for some complicated content models such as those with wildcards -- they require extra scrutiny.
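
The three recommendations can be sketched as follows. This is only an illustration under stated assumptions: the Address class below is a hypothetical typed view over a LINQ to XML tree, the exception types are my choice, and int.TryParse merely approximates the lexical space of xs:int. The sample input deliberately violates the content model (a duplicate Zip, wrong order, missing elements), yet the query for the zip code still succeeds.

```csharp
using System;
using System.Xml.Linq;

// Usage: invalid content, but 'valid enough' for the query at hand.
var xe = XElement.Parse(
    "<Address xmlns='http://www.example.com/Address'>" +
    "<Zip>68042</Zip><Zip>0</Zip><City>Mercer Island</City>" +
    "</Address>");
var adr = (Address)xe;         // shallow test on the root element tag only
Console.WriteLine(adr.Zip);    // prints 68042: the first Zip element is picked

// Hypothetical typed view with demand-driven validation.
public class Address
{
    static readonly XNamespace ns = "http://www.example.com/Address";
    readonly XElement tree;

    Address(XElement tree) { this.tree = tree; }

    // The cast takes no dependency on the content model.
    public static explicit operator Address(XElement xe)
    {
        if (xe.Name != ns + "Address")
            throw new InvalidCastException("Not an Address element: " + xe.Name);
        return new Address(xe);
    }

    public int Zip
    {
        get
        {
            // Case 3: 'pick first' semantics; order violations and
            // duplicates are tolerated rather than reported.
            var zip = tree.Element(ns + "Zip");

            // Case 1: a mandatory element is missing, so we throw.
            if (zip == null)
                throw new InvalidOperationException("Mandatory Zip element is missing");

            // Case 2: the content is not an integer, so we throw.
            if (!int.TryParse(zip.Value, out var result))
                throw new FormatException("Zip is not an xs:int: " + zip.Value);

            return result;
        }
    }
}
```

Content models with wildcards would need more than the simple Element lookup used here.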

 

 

Intellisensial XML types

 

The classic view on X-to-O mapping assumes the generation of more or less ‘normal’ object models from XML schemas so as to leverage all benefits of typed OO programming for XML processing from there on. As a welcome side effect, IntelliSense kicks in nicely and helps with querying and constructing XML data. In fact, this side effect, by itself, makes people buy into X-to-O mapping.

 

Now, why bother about object models, if IntelliSense is what you are really after? The following progression of capabilities provides us with schema-based IntelliSense without engaging in dreaded X-to-O mapping:

 

  1. Schema-based IntelliSense for XML editing is, by now, a standard capability of an XML editor. That is, once the target namespace of an XML instance is known, the associated schema can be used to guide XML editing.

 

<Address xmlns="http://www.example.com/Address">

  <Street>123 Main St</Street>

  <City>Mercer Island</City>

  <Zip>68042</Zip>

    … at this point a pop-up would tell you to fill in a <State> child …

</Address>

 

  2. XML literals provide a way to add XML syntax to a programming-language syntax such that XML elements and other XML data (potentially even with ‘expression holes’) can appear in the program text in their native notation. A typical implementation of such a capability rests on a mapping of XML literals to expressions (or statement sequences) of an available, generic XML API in the programming language at hand. For instance, an address element can be mapped to the following LINQ to XML expression, where we use functional construction in ways that closely resemble the shape of the XML literal:

 

     XNamespace ns = "http://www.example.com/Address";

     XElement xml =

       new XElement(ns + "Address",

         new XAttribute("xmlns", ns),

         new XElement(ns + "Street", "123 Main St"),

         new XElement(ns + "City", "Mercer Island"),

         new XElement(ns + "Zip", "68042"),

         new XElement(ns + "State", "WA"));

 

 

  3. Now we combine 1. and 2. such that schema-based editing of XML literals can be carried out during the normal IntelliSense-supported program editing process. Other than ‘embedding’ an XML editor in the program editor, this amalgamation also requires some tweaks such as enabling the XML editor to pick up target namespaces from the program scope, extending the program syntax to bring target namespaces into scope, and switching back to programming-language syntax for expression holes.

 

  4. Schema-based IntelliSense goes naturally beyond editing of XML literals; it should also apply to XML queries. For instance, when exercising the child or descendant axes for a generic XML tree, one would want to get helpful pop-ups regarding the possible element tags. Again, this capability requires some amalgamation efforts so that reasonably precise sets of element tags can be proposed.

 

This is one way to read where VB9 is going. Conceptually, this path may sound simple, but technically, such an amalgamation of normal program editing and schema-based XML editing is an exciting challenge (as I can tell from conversations with the VB9/XML team). Anyway, what really matters is that developers are provided with great help to deal with untyped XML while XML schemas (when available) are used to provide IDE hints. I would love to see this stuff being integrated into Visual C# and Visual Haskell, not to mention Cobol.

 

Being a somewhat extreme ‘static typing follower’ (of the Haskell school), I do have some problems accepting an IntelliSense-based approach to typed XML programming as the ultimate solution. One obvious point is that IntelliSense is no proper replacement for type checking -- I might get run-time type errors too late and I might see (or not see) too many magic casts applied. For instance, in VB9, it is important to understand that XML member access returns a String or an IEnumerable(Of XElement). Now, we could think a step further than just providing IDE hints while writing code. Say, we could want to complement IntelliSense-like tool tips with soft-typing-like checks -- perhaps delivered in the form of ‘squiggle-generating’ add-ons to the existing IDE-integrated type system. Neither the untyped nor the typed XML programmer may become much happier this way (but we would have to see). The untyped programmer may get annoyed by these clever squiggles linked to error messages talking about XSD. The typed XML programmer may still miss, well … types, for the reasons discussed in the final section below. In particular, typing cannot be restricted to design time -- think of default attributes, type restriction, key constraints, co-constraints, virtual methods, reflection …

 

Looking for a research challenge? This XML-biased discussion of IntelliSense + typing + embedded languages (here: XML) + embedded type systems (here: XSD) calls for a well-founded and practical approach to ‘open and composable type systems’ such that existing typed languages can be enhanced by type checking rules, IntelliSense support, and all the parts related to other aspects of language support (refactoring, compilation, browsing, …). I contend that we readily can and should make substantial progress in the typed XML programming arena without waiting for the solution to the general problem. In fact, given all the controversy regarding just (Scheme-like) macro systems, I don’t think that we will agree on the solution, even if it becomes available.

 

 

What to expect from proper types?

 

So why would we still engage in X-to-O mapping? In different words, what sorts of typing capabilities do we miss when we reduce typed XML programming to schema-aware program editing? From a manager perspective, there are two, largely equivalent arguments that are normally put forward in favor of X-to-O mapping:

 

  • ‘I want to get out of XML’: Dealing with XML is bothersome at times; the only way to effectively stop the developers’ struggle with XML is to recast XML processing as ‘normal’ OO programming -- by way of letting developers work with domain-specific object models as opposed to DOM (and friends), XML notation, XML namespaces and other dreaded ‘XML-isms’.

 

  • ‘I want to work with business objects’: There is no way around the fact that XML schemas are used for schema-first contracts in information systems. However, this fact must not prevent developers, designers and architects from treating all of the business data uniformly as business objects. The role of XML and XSD may best be limited to data import/export layers of information systems.

 

From a more technical perspective, there are indeed so many established benefits of objects; reproducing all such capabilities for ‘native XML programming’ may sound like a big task and still any such XML-targeted effort does not get us out of XML:

 

  • Objects with behavior
  • Objects with additional state
  • Use nominal types in method signatures
  • Use nominal types as bounds of generics
  • Test automation based on object models
  • GUI data binding
  • O/R mapping
  • Browsers for object models
  • Type information during debugging

 

Let’s pick the first item, objects with behavior: being able to associate a behavioral interface with the types defined in an XML schema appeals to me as a ‘must have’ (but I have talked to people who are similarly insistent with regard to any of the bullets in the list above). For instance, assume that we want to extend business objects for addresses such that the (preexisting) virtual method, ToString, prints out an XML tree for an address in a designated manner.

 

 

XML tree

 

<Address xmlns="http://www.example.com/Address">

<Street>123 Main St</Street>

<City>Mercer Island</City>

<Zip>68042</Zip>

<State>WA</State>

</Address>

 

 

Rendered as a string

 

 

  123 Main St

  Mercer Island, WA 68042, USA         

 

 

 

The corresponding class, Address, is easily extended. We leverage the fact that schema-derived classes are set to be partial by the assumed code generator for X-to-O mapping. Hence, the developer can add slices to the schema-derived classes. The ToString implementation of Address boils down to the following slice:

 

 

namespace www.example.com.Address

{

    public partial class Address

    {

        public override string ToString()

        {

            string variablePart = null;

 

            if (this.Street != null)

                variablePart = this.Street;

            else if (this.POBox != null)

                variablePart = "PO Box " + this.POBox;

 

            return

                  variablePart + "\n"

                + this.City + ", "

                + this.State + " "

                + this.Zip + ", "

                + this.Country;

        }

    }

}

 

 

To summarize, I contend that typed XML programming (within an OO language) calls for XML objects, say objects whose types (classes) have all the capabilities of code-first classes that I would normally define for the same sort of data -- if I had not used the schema-first model. In addition and crucially, these types are supposed to adhere to XML semantics and leverage XSD constraints in a systematic, useful and nonlossy manner. This high-level requirement leaves some room as well as some challenges for designs and implementations. In the next post, I will talk about an incubation project for a LINQ-based technology that currently explores the notion of XML objects.

 

Regards,

Ralf Lämmel